ABSTRACT

While deployed, sailors have traditionally received much of their education
at sea through correspondence and PACE courses. But with recent developments in the
Internet and videoconferencing, it is now feasible to deliver real-time educational 
material anywhere, even to a ship at sea. This thesis investigates the current status of 
networked desktop videoconferencing technology, and its use in support of Joint Vision 
2010, with respect to Distance Learning. It provides an analysis of videoconferencing 
protocols, standards, and applications, as well as a videoconferencing pilot project. The 
objective of the analysis is to determine the viability and economic benefits of using
videoconferencing technology and collaboration tools, from the desktop, as a means for
simultaneously delivering synchronous and asynchronous distance learning material from
an academic location to multiple students at remote locations. The results show that
desktop videoconferencing technology, via IP-based networks in the Defense Information
Infrastructure, is a viable tool that can add numerous economic benefits, such as
decreasing spending for travel and eliminating the need to rely on large, room-based
systems.

I. INTRODUCTION



A. INTRODUCTION 

This thesis investigates the current status of networked desktop videoconferencing 
technology, and its use in support of Joint Vision 2010, with respect to Distance Learning. 
It provides an analysis of videoconferencing protocols, standards, and applications, as well 
as a videoconferencing pilot project. It also follows work from the thesis “Internetworking:
Economical Storage and Retrieval of Digital Audio and Video for Distance Learning” [Tiddy,
96].



B. MOTIVATION 

DoD has implemented various videoconferencing systems in order to make distance 
learning more available, but there are still major obstacles. 

The current systems that have been put into place are usually based upon a model 
using a dedicated room or roll-about system, with proprietary hardware and software. Also, 
users are still required to travel to the room-based systems in order to participate in the 
training sessions. Surveys of room videoconferencing system users have identified desired 
features such as shared drawing area, the ability to connect to multiple sites, and ways to 
incorporate computer applications into the conference [Rettinger, 95]. Since there can be a
large geographical dispersion of military personnel across numerous time zones, there is 
also the problem of coordination of class times between the instructor and the student. 

Using desktops to deliver videoconferencing has multiple advantages. As users
become more familiar with the use of PCs, they will not need to learn how to provide
instruction using a room-based system, which usually requires a dedicated person to manage
the equipment. The instructor does not have to deal with scheduling blocks of time to use 
the room-based systems. Conferencing over the desktop can be more relaxed and 
impromptu, contributing to better human interaction. Most desktop videoconferencing 
software has whiteboard capabilities, allowing the student and instructor to share data in 
real-time. 



C. OBJECTIVE OF THESIS 

The primary objective of this thesis is to describe how desktop videoconferencing 
technology and collaboration tools can be used either synchronously or asynchronously to 
deliver Distance Learning content over an IP based network to multiple students at remote 
locations. Instructors might be a Chief Petty Officer (CPO) at Fleet Training Center 
Pacific, an Admiral in Washington D.C., or a professor at the Naval Postgraduate School 
(NPS). The human/computer interaction and social aspects of desktop videoconferencing
are not discussed here, but can be found in [Rettinger, 95]. Test
and evaluation of a prototype system at NPS provides an example demonstrating how
distance learning can be achieved via the PC to any remote user’s desktop. Specifically, the
research and experiments for this thesis were designed to collect data to address the 
following research questions: 
• How can we leverage the Defense Information Systems Network (DISN) to
implement desktop videoconferencing distance learning to the sea? 

• What are some of the current protocols and standards available in order to 
multicast desktop videoconferencing applications via an IP based network? 

• How can we leverage the Navy’s current JMCOMMS/ADNS program to 
implement desktop videoconferencing distance learning to a shipboard LAN at 
sea? 

• What are the technical and management concerns in order to multicast
videoconferencing applications to the user at sea? 

• What impact will multicasting video over DISN have on the system 
bandwidth/availability? 

• What are the hardware and software requirements for the instructor and student,
in order to maintain reliable communications throughout a course of instruction?

• What are some of the available videoconferencing applications 
that can be used for distance learning? 

• How much will desktop videoconferencing (distance learning) offset travel 
expenses for resident education? 



Preliminary results are evaluated for each of these questions. 



D. SCOPE OF THE THESIS 

The scope of this thesis includes: (1) showing how multicasting across IP-based
networks can be used to deliver desktop videoconferencing distance learning to sea; (2)
reviewing some of the currently available videoconferencing products and how their use can
be leveraged for distance learning; and (3) using a prototype to test and evaluate the feasibility of
the delivery and storage of videoconferencing data over an IP-based architecture to sea. The
goal is to evaluate and determine the economic and technical benefits of using currently
available desktop videoconferencing applications (versus cart and room-based systems) as
an alternative tool that an instructor and student can use to exchange course material over an
IP-based Internet and DISN.

The demonstration incorporates desktop workstations with cameras, video capture 
card, audio card, and a network connection to IP multicast capable routers. Besides the 
standard Internet protocols normally found on current desktop computers, it also contains 
videoconferencing applications capable of multicasting video and audio, either 
synchronously or asynchronously, to naval students at remote locations and at sea. 



E. METHODOLOGY 

The methodology used to produce this thesis included the following tasks: 

• Conduct a literature search of books, magazines, articles, Internet resources and 
other library information services describing videoconferencing technology and 
current software/hardware that can be applied to distance learning in the military. 

• Conduct a search of books, magazines, articles, Internet resources, and consult 
with companies to determine the current videoconferencing software and 
hardware that are best suited for Internet-to-the-sea videoconferencing.

• Develop a model to demonstrate how distance learning courses can be 
seamlessly transported from the instructor to the Internet and the Navy’s 
communication networks infrastructure, in order to provide Internet-to-sea
videoconferencing.

• Develop a prototype videoconferencing system that might be used as a part of a
“toolbox” that can be used to export a correspondence course or graduate school
class to a ship.

• Consult with the Space and Naval Warfare Systems Command (SPAWAR) and
the Research, Testing and Evaluation Division of the Naval Command Control 
and Ocean Surveillance Center (NRAD) on current developments of the Joint 
Maritime Communications System/ Automated Digital Network System 
(JMCOMMS/ADNS) and its current use with videoconferencing technology. 

F. THESIS ORGANIZATION 

This thesis is composed of eight chapters. This chapter provides the motivation, 
objectives, research questions, scope and methodology employed to conduct the research. 
Chapter II provides the history of videoconferencing, and related work. Chapter III 
discusses the current video and audio compression protocols and standards that are required 
for current videoconferencing systems. Chapter IV describes the various multicasting 
protocols and standards necessary to provide scalability, cross-platform support and quality 
of service (QoS) necessary to provide distance learning from the desktop over the 
commercial and naval IP based networks. Chapter V describes various options that can be 
applied over the DISN architecture that will support IP based desktop videoconferencing to 
sea. Chapter VI compares some of the desktop videoconferencing applications and 
protocols required to deliver distance education to sea. Chapter VII discusses the 
demonstration project and findings. Chapter VIII provides the conclusion, summary and 
recommendation for future research. 

II. RELATED WORK



A. INTRODUCTION 

This chapter provides a brief history of videoconferencing and the traditional 
methods used to provide distance learning to personnel in remote locations. It gives a brief 
overview of the various methods that can be employed to deliver distance learning across a
wide-area network (WAN). Finally, it describes some of the current VTC/videoconferencing
solutions used in the Navy and DoD. 



B. BRIEF HISTORY OF VIDEOTELECONFERENCING 

Videoconferencing was first introduced in 1926 when AT&T’s President, Walter S.
Gifford, used video teleconferencing to speak with the Secretary of Commerce, Herbert
Hoover [Nerino, 94]. Not until the late forties and early fifties, with the advent of the
television, did the next major breakthrough in video technology come about. After 
television, videoconferencing did not see another major breakthrough until AT&T 
introduced its picture telephone at the 1964 New York World’s Fair. Even then, because 
videoconferencing contained frequencies that were beyond those used by telephone 
networks at that time, expensive satellites were used to provide the medium needed for high 
bandwidths required for videoconferencing. By 1983, full-bandwidth satellite transmissions 
still cost over $1 million per year [Nerino, 94]. Today such satellite links are becoming
more affordable. 

As the 1970s progressed, new advances in computing power and improved methods
for converting analog signals to digital formats resulted in telephone service providers
transitioning to digital transmission methods to complement the existing analog processing
systems. Although videoconferencing has become more widely used for services like
business meetings, collaborative research, and distance learning, these services are generally
performed over dedicated leased lines and usually require expensive room-based or
roll-about videoconferencing systems.

Today, due to faster desktop computers and the rapid expansion of the World Wide 
Web and the Internet, transmitting real-time video using desktop computers to remote 
locations has become practical. Although there is currently an explosion in the number of 
applications that can transmit and receive streaming audio and video to and from a PC over 
the Internet, there continue to be significant interoperability, protocol and architectural
issues that must be addressed if videoconferencing is to become commonplace from the 
desktop. 



C. DISTANCE LEARNING 

1. Traditional Educational Methods 

Educational development has always been required in the career progression of
naval personnel. This training is essential to achieving and maintaining national security,
as well as national strategic objectives [Emswiler, 95]. Traditionally, the necessary
education has been provided to naval personnel through the
following methods:

• Short-term temporary duty seminars

• Resident education at technical schools (A, B, and C schools) 

• Resident education at undergraduate or graduate educational institutions, 
e.g. Naval Postgraduate School (NPS) or Naval War College. 

• Correspondence courses by postal mail.

Courses that require travel on a TAD basis are useful for initial or technical refresher
training. Nevertheless, this approach is costly and requires travel by the instructor,
student or both.

Resident education at NPS requires students to stay away from the operational forces 
for two years, on average. Although many courses require the student to be present to 
obtain the desired educational benefit, others can be easily and readily exported to sea or a 
remote shore location. 

Traditionally, postal-based correspondence courses have been necessary due to the
remote locations where naval personnel are often stationed. If the course is the equivalent to a
resident course, however, management of the correspondence course will be substantial. In 
order for the correspondence course to be successful, not only must there be a sustained 
commitment from the student, but the feedback loop to the student must be amenable to 
continuing, timely instruction. Often this is not the case: for numerous reasons, it
may be weeks before the student receives feedback or new
modules. As a result, many students do not finish.

2. The Value of Distance Learning in the Navy



Distance learning in the Navy can be beneficial in two important areas: cost and
global reach. With a decreasing defense budget, the allocation of MILPERS dollars, which
pay for travel and education, is ever decreasing. Besides costs, the naval environment
requires personnel to be deployed at remote or isolated settings that are far from traditional
educational resources. A more time-efficient delivery of course material and feedback to
the student can markedly improve the dedication of the student to complete the course of
instruction. Figure 2-1 outlines the general situations in which distance learning can be
preferable to traditional methods.

• Target audience is widely scattered and it is not cost effective or possible to have 
them travel to a central training location. 

• Content or consistency in delivery is so critical that it must be carefully controlled 
for accuracy or correct interpretation. 

• Content is too dangerous for novices to participate in and distance education will 
allow for familiarization and confidence building prior to the actual situation. 

• Scheduling difficulties arise because the student cannot take extended time from 
other critical missions to attend a normally conducted training program. 

• The expense of conducting live training is cost prohibitive. 

• There are a limited number of qualified trainers. 

Figure 2-1 Productive applications of a distance education approach [Biggs, 94] 



3. Distance Learning via the Internet 

The World Wide Web (WWW) provides a means of providing both time-efficient 
course material and research tools. Distance education can be as simple as a
correspondence course offered through electronic mail, as complex as interactive
video teleconferencing over the Internet, or a combination of both [Tiddy, 96].

As more ships, commands, and individual units become connected to local area 
networks (LAN’s) and wide-area networks (WAN’s), distance learning programs can be 
more easily implemented, ultimately providing more economical resources for training 
[Emswiler, 95]. Also, as video/audio application and transport protocols and standards 
become more established, commercially produced products become more readily available 
to furnish the tools necessary to provide distance learning over commercial and Department 
of Defense (DoD) networks. To date, however, the growth of Internet and DoD network
applications and users is outpacing the growth of bandwidth. With limited dollars for
education and travel, DoD cannot wait until this trend reverses itself. Therefore it is critical
to use well-developed standards and protocols (e.g., multicasting and compression), along
with existing network infrastructures, in order to get the most efficient delivery to remote
users.
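
The bandwidth argument for multicasting can be made concrete with a small calculation. The following Python sketch compares the load placed on the sender's network link when one lesson is sent to many students by unicast versus multicast. The 128 Kbps stream rate, the student count, and the function names are hypothetical values chosen only for illustration.

```python
# Illustrative only: sender-side bandwidth for unicast versus multicast.
# The stream rate and receiver counts used with these functions are
# hypothetical, not measurements from the thesis.


def unicast_load_kbps(stream_kbps: float, receivers: int) -> float:
    """With unicast, the sender transmits one copy of the stream per receiver."""
    return stream_kbps * receivers


def multicast_load_kbps(stream_kbps: float, receivers: int) -> float:
    """With multicast, the sender transmits a single copy; routers replicate
    it only where the paths to the receivers diverge."""
    return stream_kbps if receivers > 0 else 0.0


if __name__ == "__main__":
    rate, students = 128, 20
    print("unicast  :", unicast_load_kbps(rate, students), "Kbps")
    print("multicast:", multicast_load_kbps(rate, students), "Kbps")
```

Under these assumptions the unicast load grows linearly with class size, while the multicast load at the sender stays at the rate of a single stream.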



4. Videoconferencing in the Department of Defense

DoD has used videoconferencing technology in a wide variety of applications.
Some of the major areas where this technology is being used are:

• Training 

• Telemedicine 

• Group Conferences/Meetings 

• Crisis Response 

Videoconferencing technology has started to bring significant savings to DoD, 
mainly in travel expenses. The need for military personnel to travel to attend meetings,
conferences, training, and exercises has been greatly reduced for commands that have access
to videoconferencing equipment. The following examples contain more specific 
descriptions of areas where videoconferencing technology is being or has been applied in 
DoD: 

a.) Training: NPS Distance Learning via the Multicast Backbone (MBone)
system: NPS has conducted “Distance Learning”, or remote classroom instruction, through
the use of videoconferencing technology over the MBone. In a 1995 thesis by Tracy 
Emswiler, it was demonstrated that videoconferencing technology could be an economically 
feasible approach to distance learning. It documented Dr. Richard Hamming’s course, 
“Learning to Learn”, being transmitted worldwide over the MBone for an entire quarter. 
[Emswiler, 95] 

NPS is also currently delivering distance learning in Root Hall, using a PictureTel
4000 videoconferencing system over Integrated Services Digital Network, Basic Rate
Interface (ISDN BRI) lines. Courses, and even some degree programs, are offered in 
Computer Science, Electrical Engineering, Aerospace Engineering and Information 
Technology Management. 

The Chief of Naval Education and Training (CNET) Electronic Schoolhouse 
Network (CESN) is a two-way video and audio multipoint, secure distance learning 
network. It allows simultaneous instruction to multiple shore and shipboard sites, where 
individuals can interact both verbally and visually in a real-time mode. Its purpose is to 
provide effective training to a large number of personnel at or near their duty stations, 
eliminating the need for travel to distant schoolhouses, thereby reducing travel and per diem 
costs. 

The Navy’s Video Tele-training (VTT) CESN is linked via land lines and operates at
a fractional T-1 data rate of 384 Kbps. Communication is provided through the
government’s long-haul communications network using FTS2000. Satellite capability is 
available for shipboard VTT. The network is made up of 16 sites nationwide and includes a 
site on board the USS George Washington [CNET, 97]. 

b.) Telemedicine: This is a field where videoconferencing is making significant
inroads. Basically, the same idea from distance learning is applied to telemedicine: a central
care facility with medical expertise (e.g., physicians and surgical staff) can provide care
“remotely” to a distant site via videoconferencing. A huge potential for this technology
exists in afloat applications, since most U.S. Navy and Coast Guard ships have medical
personnel who can provide only a basic level of care. One practical use was demonstrated
when telemedicine was used on the USS George Washington (CVN 73) to provide
mental health examinations, during a 1997 deployment. Psychiatrists successfully evaluated 
onboard patients, capturing their mood, body language and response to questions [Koenig, 
97]. Additionally, during JWID 97, the Naval Medical Information Management Center 
(NMIMC), Bethesda, Maryland sponsored a demonstration of telemedicine technologies 
aboard the submarine USS Atlanta (SSN 712) in Norfolk, Virginia. Once ships are routinely 
outfitted with this technology, a tremendous benefit in Telemedicine will surely be realized. 

c.) Group Conferencing: In September 1995, a major Joint Task Force (JTF)
exercise was conducted in Panama: Exercise “Fuertes Defensas” (Strong Defense). Led by
the Commander, 18th Airborne Corps, this exercise was conducted to test United States
readiness to support and defend the Panama Canal. Each day, the JTF Commander (an
Army LGEN) was able to keep advised of exercise progress by conducting a morning
videoconference with his Army, Navy, Air Force, and Marine Component Commanders.
These commanders were sometimes physically separated by hundreds of miles. Because of
videoconferencing technology, the commander was able both to remain well informed of
exercise progress and to promulgate his own directives and intentions for the
day.

d.) Crisis Response: There is a huge potential for further use of
videoconferencing technology for Crisis Response Management. For example, Navy and
Marine Corps Afloat and Expeditionary Commanders might receive real-time combat 
instructions from their superiors via videoconferencing. Also, these Task Force 
Commanders might promulgate their own guidance to their attached ships and elements in 
the same fashion, all the way down the chain of command. There is also a large potential 
for this technology in non-combat crisis management situations, such as humanitarian 
disaster relief operations. 






III. MAJOR VIDEOCONFERENCING STANDARDS 



A. INTRODUCTION 



This chapter discusses the major videoconferencing standards, which are
significant considerations when implementing distance learning to sea from the desktop.



B. BACKGROUND INFORMATION



The International Telecommunication Union (ITU), a body of the United Nations
that focuses on developing standards, tasks the Telecommunication Standardization Sector
(ITU-T) with developing telephony standards. ITU-T develops some of the major protocols that
are used by videoconferencing systems today, such as H.320, H.323, and H.324.
Table 3-1 provides an overview of those standards.



Standard  Description                        Remarks
--------  ---------------------------------  ---------------------------------
H.320     H.320 is an "umbrella" standard    Mandatory standard by the
          that covers audio, video,          Federal Government in 1993.
          videoconferencing, graphics,
          and multicasting

H.323     Visual (audiovisual)               Addresses audiovisual
          communications over LANs           communications across LANs and
                                             gateways that connect LANs to
                                             the Internet.

H.324     Defines a multimedia               Incorporates the most common
          communication terminal operating   global communications facility
          over the Switched Telephone        today (POTS)
          Network. It includes H.261,
          T.120, and V.34.

Table 3-1 : ITU-T Videoconferencing Standards 

The videoconferencing systems and standards described above can be viewed as
having evolved over three generations. The 1st generation systems were generally point-to-point,
proprietary systems that usually required dedicated T-1 (1.5 Mbps) networks or better.
Videoconferencing coding and compression was usually done by hardware
compressors/decompressors (codecs). There were not many standards initially because
interoperability of the various systems was not perceived as an issue. 2nd generation
systems were driven by Integrated Services Digital Network (ISDN). The compression was
also usually done by proprietary, hardware codecs. As the technology matured, and
compatibility became more of an issue, videoconferencing application developers began to 
adopt universal standards, ultimately migrating towards ITU-T’s H.320 protocol. Also, 
ISDN's inability to scale to a large number of users limited its acceptance. Today, as 
network-centric computing has migrated to the core of many organizations, compatibility 
has become a focal point in the development of videoconferencing systems, thus bringing 
about 3rd generation system protocols. These new standards are generally designed to match
the ISO seven-layer reference model. Now, with advances in modeling and simulation (such as
MPEG-4 compression) and improved scalability due to multicasting, 4th generation
standards are coming about.



C. 1st GENERATION STANDARDS

1st generation videoconferencing systems are usually large room-based systems that
are connected via dedicated circuit-switched or T-1 connections. These systems are
point-to-point, and use proprietary system standards to deliver and receive content. Additionally
they are not very scalable, and many of the international standards-based systems used today
are not backwards compatible with them. Therefore they would not be feasible
for providing IP-based distance learning to sea.



D. 2nd GENERATION STANDARDS

1. H.320

H.320 - "Narrow-Band Visual Telephone Systems and Terminal Equipment" is the 
umbrella standard that covers audio, video, videoconferencing, graphics and multicasting. 
ITU-T recommends it as the minimum standard that will ensure that videoconferencing 
systems will communicate with each other. H.320 covers a family of standards that governs
videoconferencing systems that use coder/decoders (codecs) operating at rates from 64 Kbps to 1920 Kbps
(30 x 64 Kbps). It became the mandatory standard for the Federal Government in 1993
[Nerino, 94].
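
As a rough illustration of the rate range quoted above, the following Python sketch lists the aggregate rates available when a codec uses p channels of 64 Kbps each. The function name and the sample values of p are assumptions made only for this example.

```python
# Illustrative only: aggregate H.320 channel rates as multiples of 64 Kbps.
CHANNEL_KBPS = 64   # rate of a single channel
MAX_CHANNELS = 30   # upper bound from the text: 30 x 64 = 1920 Kbps


def h320_rate_kbps(p: int) -> int:
    """Return the aggregate rate in Kbps for p channels (1 <= p <= 30)."""
    if not 1 <= p <= MAX_CHANNELS:
        raise ValueError("p must be between 1 and 30")
    return p * CHANNEL_KBPS


if __name__ == "__main__":
    for p in (1, 2, 6, 30):
        print(f"p = {p:2d}: {h320_rate_kbps(p)} Kbps")
```

For instance, p = 6 gives 384 Kbps, the fractional T-1 rate mentioned earlier for the CESN network.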

The differences between the various videoconferencing systems depend upon the
optional requirements that each can support, which ultimately affect the quality of the
audio and video. How well the features are implemented is left up to each manufacturer.
Table 3-2 shows H.320 recommendations and their titles. 

Category                 Recommendation
-----------------------  --------------------------------------------------
Video Codec              H.261: Video codec for audiovisual services at
                         p x 64 Kbps

Audio Codec              G.711: Pulse code modulation (PCM) of voice
                         frequencies
                         G.722: 7 kHz audio coding within 64 Kbps
                         G.728: Coding of speech at 16 Kbps using low-delay
                         code excited linear prediction

Frame Structure          H.221: Frame structure for a 64 to 1920 Kbps
                         channel in audiovisual teleservices

Control and Indication   H.230: Frame-synchronous control and indication
                         signals for audiovisual systems

Communication Procedure  H.242: System for establishing communication
                         between audiovisual terminals using digital
                         channels up to 2 Mbps

Table 3-2: H.320 Recommendations [Nerino, 94] 



H.320 only requires vendors to support the minimum standards. When deciding 
between systems, there are currently three classes of videoconferencing systems: 

Class 1 - minimum level of support 

Class 2 - Class 1 + support of some optional features 

Class 3 - Class 1 + all optional features [VTEL, 95] 

The major factors that affect system quality are picture resolution, frame rate, 
preprocessing and postprocessing, motion compensation, audio, data rate and quality. 



a. Picture Resolution 

Picture resolution is the frame format of the video picture. The National
Television System Committee (NTSC) standard picture frame consists of 780 horizontal
picture elements (pixels) and 480 active vertical lines. Due to bandwidth constraints of the
standard videoconferencing channels used today, that picture size is not practical for current
videoconferencing systems. H.320 uses quarter common intermediate format (QCIF), with 176
x 144 pixel resolution, and common intermediate format (CIF), with 352 x 288 pixel
resolution. When systems that support different picture resolutions connect, they
negotiate down to the lowest one.
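
To show why even these reduced picture sizes need heavy compression, the following Python sketch estimates the uncompressed bit rate of QCIF and CIF video and the reduction needed to fit a given channel. It assumes 4:2:0 chroma subsampling with 8-bit samples (an average of 12 bits per pixel) and ignores the share of the channel used by audio. The function names are hypothetical.

```python
# Illustrative only: uncompressed bit rate of QCIF and CIF video.
# Assumption: 4:2:0 chroma subsampling with 8-bit samples, i.e. an
# average of 12 bits per pixel (8 luma + 4 chroma).
BITS_PER_PIXEL = 12

FORMATS = {"QCIF": (176, 144), "CIF": (352, 288)}


def raw_kbps(fmt: str, fps: float) -> float:
    """Uncompressed video rate in Kbps (1 Kbps = 1000 bits per second)."""
    width, height = FORMATS[fmt]
    return width * height * BITS_PER_PIXEL * fps / 1000


def compression_ratio(fmt: str, fps: float, channel_kbps: float) -> float:
    """How many times the raw video must shrink to fit the channel."""
    return raw_kbps(fmt, fps) / channel_kbps


if __name__ == "__main__":
    print(f"QCIF at 7.5 fps over 64 Kbps : {compression_ratio('QCIF', 7.5, 64):.1f}:1")
    print(f"CIF at 30 fps over 384 Kbps  : {compression_ratio('CIF', 30, 384):.1f}:1")
```

Under these assumptions, CIF at 30 fps is roughly 36.5 Mbps uncompressed, so fitting it into 384 Kbps requires a reduction of about 95 to 1.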

b. Frame Rate 

H.320 can support frame rates of 7.5, 10, 15, and 30 frames per second (fps).
Class 1 systems can support a frame rate of 7.5 fps; Class 2, typically about 15 fps, using
QCIF; and Class 3 supports 30 fps, using CIF. Frame rate negotiation uses the lower class
when two or more classes are used [VTEL, 95].
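
The fallback rule described above can be sketched in a few lines of Python. This is a simplification: it reduces each system to a single class number, whereas real systems exchange fuller capability sets. The table layout and function name are hypothetical, and the QCIF format for Class 1 is an assumption, since the text gives only its frame rate.

```python
# Illustrative only: fall back to the lowest common class in a conference.
# Frame rates and formats paraphrase the text above; the Class 1 format
# (QCIF) is an assumption, and the data layout is hypothetical.
CLASS_CAPS = {
    1: {"format": "QCIF", "fps": 7.5},
    2: {"format": "QCIF", "fps": 15},
    3: {"format": "CIF", "fps": 30},
}


def negotiate(*classes: int) -> dict:
    """Return the capabilities of the lowest class among the participants."""
    if not classes:
        raise ValueError("at least one participant is required")
    return CLASS_CAPS[min(classes)]


if __name__ == "__main__":
    # A Class 3 instructor station joined by Class 2 and Class 1 students.
    print(negotiate(3, 2, 1))
```

In this simplified model, a single Class 1 participant pulls the whole conference down to the Class 1 frame rate.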

c. Preprocessing and Postprocessing 

Preprocessing reduces the amount of re-coding in the background. If there is 
poor camera lighting, video “noise” can make the system think that there is motion in the 
background when in fact there is none. Preprocessing prevents the video encoder from 
wasting time encoding “noise” caused by the poor lighting, ultimately ensuring that only 
real motion gets encoded [VTEL, 95].

Postprocessing compensates for the picture degradation due to fast motion. It 
can help reduce the “blocking” and noisy effects caused by video codecs (discussed in more 
detail under H.261). Postprocessing can also be used to enhance the frame rate, thus
reducing jerky motion [VTEL, 95].

d. Motion Compensation 

Motion compensation is another video quality enhancement. There are two
aspects of motion compensation: motion estimation and actual motion compensation.
Motion estimation is performed at the video encoder to determine the motion vector of the
subject. Motion compensation is performed at both encoder and decoder. It consists of
moving blocks of video data around based on the motion vector determined during motion 
estimation. Especially important at lower bit rates, motion compensation moves only the 
encoded section of video where motion has occurred rather than the entire video area of 
each frame. All H.320 systems have the ability to decode a motion compensation signal. 
Providing encoded motion compensation (where the real video quality improvements are 
made) is optional [VTEL, 95]. 
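
Motion estimation as described above can be illustrated with a minimal exhaustive block-matching search. The following Python sketch is not taken from any particular codec: the block size, search range, frame representation (lists of rows of grayscale values), and function names are all assumptions chosen to keep the example small.

```python
# Illustrative only: exhaustive block-matching motion estimation.
# Frames are lists of rows of grayscale values; the block size and search
# range are small hypothetical values, not those of any real codec.


def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame
    and the block displaced by (dx, dy) in the previous frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def estimate_motion(prev, cur, bx, by, size=4, search=2):
    """Return the displacement (dx, dy) into the previous frame that best
    matches the current-frame block whose top-left corner is (bx, by)."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), sad(prev, cur, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the previous frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

In this sketch the returned vector points from the block in the current frame to its best match in the previous frame. An encoder that sends this vector, plus any remaining difference, lets the decoder rebuild the block by copying pixels it already holds rather than receiving the block again.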

Although the aforementioned factors affect H.320 system quality, many other 
elements also affect quality. Table 3-3 provides a summary of H.320 compliance. 





                       Level 1         Level 2         Level 3
                       (Minimum)       (Medium)        (High)
---------------------  --------------  --------------  -------------------
Frame Format (pixels)  QCIF            CIF             CIF
                       (176 x 144)     (352 x 288)     (352 x 288)

Frame Speed            5               Up to 15        Up to 30
(frames/sec)

Data Rate              56 / 64 Kbps    Up to 384 Kbps  Up to 1.544 Mbps

Motion Compensation    No              Limited         Full Motion
                                       (6 x 6 = 36)    (30 x 30 = 900)

Pre and post           Not Applicable  Not Applicable  Pre and post
processing on both                                     processing on both
encoder and decoder                                    encoder and decoder

Table 3-3: Levels of H.320 compliance [Nerino, 94] 



2. H.320 and ISDN 

ISDN is a connection-oriented circuit-switched digital communication service that is
provided by telephone companies and network providers. It provides end-to-end digital
connectivity between local area networks (LANs). ISDN connects users to LANs and can
also connect LANs to wide-area networks (WANs). The basic ISDN connection bandwidth
is 128 kbps, split between two bearer channels (for video and audio) at 64 kbps each. There is an
additional 16 kbps data channel that carries connection signaling data.

Implementation of ISDN channels is fairly flexible. Telephone companies provide
services that allow ISDN channels to be split (e.g., a 64 kbps channel split into two 32 kbps
channels to provide low-fidelity digitized voice) or bonded together. Bonding is
accomplished by creating one logical channel out of multiple virtual channels. For example,
the Navy’s Video Information Exchange System (VIXS) uses bonding to provide bandwidth
of 112-384 kbps in order to allow afloat and ashore nodes to conduct face-to-face meetings
in real time.
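The bonding arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the channel counts shown (two 56 kbps channels and six 64 kbps channels) are an assumption about one way the quoted 112-384 kbps range could arise, not a description of how VIXS is actually configured.

```python
# Illustrative arithmetic only: aggregate rate of bonded ISDN bearer channels.
def bonded_rate_kbps(channels: int, per_channel_kbps: int = 64) -> int:
    """Return the aggregate rate of `channels` bonded bearer channels."""
    if channels < 1:
        raise ValueError("at least one channel is required")
    return channels * per_channel_kbps

# Assumed channel counts that bracket the range quoted in the text.
low = bonded_rate_kbps(2, 56)    # two 56 kbps channels
high = bonded_rate_kbps(6, 64)   # six 64 kbps channels
```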

ISDN offers improved videoconferencing connectivity over dedicated, point-to-point
systems, because it works over existing phone lines and does not require the installation of
an extensive network backbone. Unfortunately, there are several major reasons why ISDN is
not a good long-range alternative for distance learning. One is the lack of access to remote
users in a globally dispersed military environment. Another is that multicasting requires
additional equipment to handle the end points, i.e. multipoint control units
(MCUs). Finally, continuing to implement ISDN as the primary videoconferencing long-haul
architecture in the Navy is at odds with the Defense Information Infrastructure
Common Operating Environment (DII COE) migration towards the consolidation of voice,
video and data networks.
Recent versions of videoconferencing systems that use ISDN as their transport medium
have begun to migrate to the H.320 protocol, but many vendors still use proprietary
protocols in their videoconferencing systems. One possible reason is that, although the
H.320 standard is technically sound, ISDN has had a poor showing in the marketplace;
consequently, bundling H.320 with ISDN has inhibited initial acceptance of H.320.



E. 3rd GENERATION STANDARDS

1. Internet Videoconferencing 

As the Internet and client-server computing continued to grow, videoconferencing
systems for LANs and WANs began to be developed. H.323 (an extension of H.320) covers
videoconferencing over narrow-band WANs and also over LANs. Since H.323 is based
upon the IETF’s Real-Time Transport Protocol (RTP), which will be discussed in more
detail in Chapter IV, it can be applied to streaming video over packet-switched networks
such as the Internet. H.323 also applies to point-to-point and multipoint sessions. Some of the
other components of H.323 include:

• Specifying messages for call control including signaling, registration and
admissions, and packetization/synchronization of media streams.

• Specifying messages for opening and closing channels for media streams, and
other commands, requests and indications.

• H.261 (video codecs)

• H.263 - Specifies a new video codec for video over POTS (< 64 Kbps).

• G.711, G.722, G.728 and G.729 standards

• H.230 Frame Synchronous Control Standards

• H.245 Link Control Standards

• T.120 Data Sharing Standard

2. Plain Old Telephone Service (POTS)

POTS is the acronym for Plain Old Telephone Service, the existing infrastructure of
analog telephone lines. The H.324 standard was designed to address the need for an
inexpensive, high-quality solution for videoconferencing over this existing infrastructure.
It addresses high-quality video and audio compression over POTS modem
connections. Specifically, it specifies a common method for sharing video,
data, and voice simultaneously using high-speed (V.34) modem connections over a single
POTS telephone line.

Videoconferencing over POTS has been the least attractive of the transmission media
options due to its bandwidth constraints. However, because H.324 builds on the most common
global communications facility available today, POTS has a broad impact on the current
marketplace. Even though the actual bandwidth of POTS has not grown much, it is
becoming less of an obstacle, since today’s modem technology and data compression make
it technically feasible to transmit both very low frame rate video and voice over a single
line. As processors have become more capable, codec functions are now performed
primarily in software, often achieving full-color, 15 frames per second (under optimal
conditions), full duplex video and audio, with real-time responsiveness. Some of the major
components of H.324 are:
• H.263 - Defines video coding at rates less than 64 Kbps.

• H.261 - Video compression from 64 Kbps to 2 Mbps.

• H.223 - Defines a multiplexing protocol for low bit rate multimedia terminals.

• H.245 - Defines control of communications between multimedia terminals.

• G.723 - Defines speech coding for multimedia telecommunications transmitting
at 5.3/6.3 Kbps.



F. VIDEO COMPRESSION 

In the past, due to the bandwidth constraints of terrestrial media, satellite was the
traditional and reliable method for transporting videoconferencing between users. Due to
technical improvements in routing and switching, however, high-quality
videoconferencing can also be realized with dedicated circuit-switched channels.
Unfortunately, due to the high cost and lack of widespread availability of these channels, most
desktop computer users do not have access to a dedicated videoconferencing link that can
transfer data at the necessary data rates. The chief digital transportation medium that the
average computer user has access to is the Internet, which is based upon a non-guaranteed
bandwidth, packet-switching technology and is often connected to an end user via POTS. Even as
more capable routers, switches and modems are used to deliver videoconferencing,
providing coherent end-to-end video and audio streams across the Internet remains a major
obstacle, due to the lack of guaranteed bandwidth. Video and audio quality can be very poor
due to Internet congestion, routing delay, packet loss/retransmission, constant packet
rerouting, limited multicasting capabilities, and other factors.
One way to reduce bandwidth requirements is to compress the data prior to its traversing a
network. This can generally be accomplished using two types of data compression schemes:
lossless and “lossy.” Lossless compression schemes are generally used in formats like
zip, gzip and gif. When using these types of algorithms, no data is lost during the
compression and subsequent decompression of the data. The lossy
compression algorithms search for redundant data and replace it with approximations.
Fortunately, due to the inability of the human eye to discern small losses of data in a
digital image (notably the fact that small color details aren’t perceived as well as small
details of light and dark), lossy compression techniques are very suitable for
videoconferencing.
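The distinction between the two schemes can be illustrated with a minimal Python sketch. The lossless half uses the standard zlib module; the "lossy" half is a toy stand-in (it simply discards the low-order bits of each sample) and is not the method used by any videoconferencing codec. The sample data and function name are made up for the illustration.

```python
import zlib

# Made-up 8-bit samples with a repeating pattern.
samples = bytes([10, 11, 10, 12, 200, 201, 199, 200] * 32)

# Lossless: decompression returns exactly the original bytes.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)

# Toy lossy scheme: keep only the top `keep_bits` bits of each sample,
# so each value is replaced by a nearby approximation.
def lossy_round_trip(data: bytes, keep_bits: int = 4) -> bytes:
    shift = 8 - keep_bits
    return bytes((b >> shift) << shift for b in data)

approx = lossy_round_trip(samples)
max_error = max(abs(a - b) for a, b in zip(samples, approx))
```

With the lossless scheme the restored data equals the input; with the toy lossy scheme the data differs from the input, but every sample stays within one quantization step of its original value.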

There are a number of compression techniques available for use in 
videoconferencing, and H.261 is one of the most widely used in commercial 
videoconferencing products. Motion JPEG, Indeo, MPEG1, and MPEG2 are also 
prevalent. H.261 is optimized for bandwidth efficiency and low delay, whereas MPEG is 
less bandwidth efficient. MPEG is editable and provides the high visual quality required by 
movie-type applications. Indeo compression, offered by Intel, is optimized for low decode 
processing requirements. 

In order to provide an appreciation of video compression algorithms, an overview of
H.261 will be given. Audio compression will not be discussed in detail since it uses the
same basic principles used for video compression. A good reference for audio compression
details is [Rettinger, 95].

H.261 is a widely used international video compression standard for
videoconferencing that is designed for applications which use synchronous circuit-switched
networks, e.g. ISDN, as their transmission channels. It was approved by the International
Telecommunication Union (ITU, formerly CCITT) in 1990, and is currently used in
conjunction with H.320, H.323 and H.324. H.261 is an interoperability standard that
pertains to communication between encoders/decoders (codecs) used by videoconferencing
systems. It is often called Px64, where P (1-30) is the multiple of 64 Kbps at which
video is sent. H.261 is similar to other “lossy” compression standards like JPEG, MJPEG and
MPEG. JPEG is a compression standard used for
still pictures, whereas MPEG and H.261 deal with motion video. Motion JPEG (MJPEG)
generally uses the same techniques as H.261, such as Discrete Cosine Transform (DCT)
encoding, quantization, macroblocks, etc. Using “lossy” compression algorithms, H.261 has
provided a major advantage in dealing with the bandwidth constraints of various transmission
media, without losing any significant picture quality (at least as far as the human eye is
concerned). Although both MPEG and H.261 handle motion pictures, MPEG is designed to handle
compressed bitstreams for the moving picture components of audio/visual services at rates
from 0.9 to 1.5 Mbps. H.261, designed to target videoconferencing applications where
motion is naturally limited, is specified from 64 Kbps to approximately 2 Mbps.

Because the algorithms used in codecs are computation-intensive, early
videoconferencing systems implemented them in a separate piece of hardware. With
today’s more powerful processors, however, the computations can be done by the
computer’s onboard processor.

H.261 uses the Discrete Cosine Transform (DCT) and motion compensation to take advantage
of the intraframe spatial and interframe temporal redundancy found in picture data. Spatial
redundancy refers to the similarities in information within the same picture frame.
Exploiting it allows a small number of bits to describe areas (pixels) of a picture that are
the same color, thereby eliminating the need to code each pixel for every transmission of
data across the channel. Temporal redundancy refers to similarities of information
between adjacent frames in a group of moving pictures; using motion compensation, only
pixels that have changed from one frame to the next are transmitted. In summary, the DCT
gets rid of redundant data bits in each block of picture frame data.

H.261 also takes advantage of limitations in the human eye. Even though NTSC’s
standard for transmitting moving pictures is 30 frames per second, the human eye can only
discern movement up to about 24 frames per second. In fact, the human eye perceives even
15-25 frames per second as smooth motion.

Using “lossy” compression algorithms in H.261 has provided a major advantage in 
dealing with the bandwidth constraints of various transmission mediums, without losing any 
significant picture quality. 
1. H.261 Structure



Figure 3-1 depicts a flow diagram of a typical H.261 standards based system 
encoder. 




Figure 3-1 Encoder Flow Diagram [Jin, 96]

For every frame after the first in a picture sequence, the encoder
determines whether the reference frame will be the present picture frame or the
previous frame. If the reference frame is the present frame, intra-frame coding (I-coding) is
performed. With I-coding, the data goes directly through a discrete cosine
transform (DCT), where it is transformed from the spatial to the frequency domain.
The DCT coefficients are then sent to a quantizer, where each coefficient is expressed as a
level from a finite number of predetermined levels. After quantization, a decision is made to
determine if the current macroblock (a 16x16 pixel array built from 8x8 blocks) is valid.
Bit-rate control is performed, and eventually the bits are encoded and transmitted.

If the reference frame is going to be from the previous frame, inter-frame (P-coding) 
is performed. Here, a motion vector search is performed to determine the right direction to 
begin the search for the nearest, most similar macroblock between the current (target) and 
previous (reference) frames. After a match is found, either no filtering (subtract the pixel 
values of the matched macroblock in the previous frame from those in the current 
macroblock) or filtering (subtract the pixel values of the filtered matched macroblock in the 
previous frame from those in the current macroblock) is performed. A differential pulse 
code modulator (DPCM) codes the difference between the successive values instead of 
coding the actual values. 
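The idea of coding differences rather than actual values can be shown with a minimal Python sketch. It is a simplified, lossless form of DPCM with no quantizer in the loop; the function names and sample values are hypothetical, and a real H.261 encoder is considerably more involved.

```python
def dpcm_encode(values):
    """Return the first value followed by successive differences."""
    previous = 0
    diffs = []
    for value in values:
        diffs.append(value - previous)
        previous = value
    return diffs

def dpcm_decode(diffs):
    """Rebuild the original values by accumulating the differences."""
    total = 0
    values = []
    for diff in diffs:
        total += diff
        values.append(total)
    return values

encoded = dpcm_encode([100, 102, 101, 101, 105])
```

When neighboring values are similar, the differences are small numbers clustered around zero, which later coding stages can represent with fewer bits than the raw values.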
Figure 3-2 H.261 Data Structure [Jin, 96]



As shown in Figure 3-2, an H.261 video sequence is organized hierarchically into picture
frames, Groups of Blocks (GOBs), macroblocks, and blocks. Each picture is divided
into twelve GOBs for the CIF frame format or three GOBs for the QCIF frame format.
Thirty-three macroblocks are organized in a fixed 11x3 format to form a GOB.

As shown in Figure 3-3, each macroblock consists of four 8x8 luminance 
(brightness) and two 8x8 chrominance (color) blocks. 
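The counts in this hierarchy can be checked with a short Python sketch. It assumes that each macroblock covers a 16 x 16 area of luminance pixels (four 8x8 luminance blocks, as described above); the function and constant names are hypothetical.

```python
MACROBLOCK_SIDE = 16        # luminance pixels per macroblock side (four 8x8 blocks)
MACROBLOCKS_PER_GOB = 33    # fixed 11 x 3 arrangement
BLOCKS_PER_MACROBLOCK = 6   # four luminance + two chrominance blocks

def picture_layout(width, height):
    """Count the macroblocks, GOBs and 8x8 blocks in one picture."""
    macroblocks = (width // MACROBLOCK_SIDE) * (height // MACROBLOCK_SIDE)
    return {
        "macroblocks": macroblocks,
        "gobs": macroblocks // MACROBLOCKS_PER_GOB,
        "blocks": macroblocks * BLOCKS_PER_MACROBLOCK,
    }

cif = picture_layout(352, 288)
qcif = picture_layout(176, 144)
```

For CIF this yields 396 macroblocks in 12 GOBs, and for QCIF 99 macroblocks in 3 GOBs, which agrees with the twelve and three GOBs stated above.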




2. Discrete Cosine Transform (DCT) 

H.261 uses Discrete Cosine Transform (DCT), a form of frequency 
transformation which converts a signal from its spatial domain to its frequency domain in 
order to take advantage of the spatial and temporal redundancy in the picture data. Spatial 
redundancy keeps track of the similarities in information in the same picture frame. It relies 
on a small number of bits to describe areas (pixels) on a picture that are the same color, 
therefore eliminating the need to code each pixel for every transmission of data across the 
channel. Temporal redundancy takes advantage of similarities of information between 
adjacent frames in a group of moving pictures, therefore only pixels that have changed from 
one frame to the next are transmitted. Since the eye is more receptive to luminance than it is
to chrominance, bit representations of luminance both contain more bits and are sampled
more frequently than the color components, which tend to be noisy.
In H.261, a two-dimensional DCT is performed on 8 x 8 pixel blocks (luminance and
chrominance). Unlike the Discrete Fourier transform, all multiplications in the DCT use
only real values, thus lowering the number of required computations. The 8 x 8 array is
input into the DCT, and the output is an 8 x 8 array of DCT integer coefficients, with the
number of nonzero values significantly decreased. This reduction in nonzero values is only
the first part of the compression. For most images, much of the signal energy is in the lower
frequencies, which appear in the upper left corner of the DCT array. The lower right values
represent higher frequencies, and are often small enough to be neglected with little visible
distortion. Figure 3-4 is a mathematical model of the two-dimensional Discrete Cosine
Transform (DCT).
Figure 3-4 Two dimensional Discrete Cosine Transform [Jin, 96]
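As a rough illustration of the transform, the following Python sketch computes a naive 8 x 8 forward DCT using the commonly quoted type-II definition with orthonormal scaling. It is not taken from [Jin, 96] or from the H.261 text: it returns floating-point rather than integer coefficients, and practical codecs use much faster factored algorithms.

```python
import math

N = 8  # block size

def dct_2d(block):
    """Naive 8x8 forward DCT:
    F(u,v) = 1/4 C(u) C(v) sum_x sum_y f(x,y)
             cos((2x+1) u pi / 16) cos((2y+1) v pi / 16),
    with C(0) = 1/sqrt(2) and C(k) = 1 otherwise.
    """
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

# A perfectly flat block: all of its energy lands in the upper left (DC) term.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

For the flat block, only the upper left coefficient is nonzero, which is the extreme case of the energy concentration described above.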



3. Quantization 

The degree of quantization determines the image quality. A large quantization step
size can produce unacceptably large image distortion. Conversely, too fine a step size can
lead to lower compression ratios. The key challenge is to quantize the DCT coefficients as
efficiently as possible. H.261 does this by taking advantage of the limitations in the human
eye’s ability to discern high frequencies. The quantization matrix is an 8 x 8 matrix of step
sizes (quantums), which provides an element for each DCT coefficient. As mentioned
previously, the upper left of the DCT array holds the lower frequencies; step sizes there are
small, and they are large in the lower right (high frequencies). The quantizer divides the DCT
coefficients by their corresponding quantum and then rounds to the nearest integer. Large
quantums drive the small coefficients down to zero, with the result that many high-frequency
components become zero and are easier to encode. The low-frequency components undergo only
minor adjustments. Eventually, only the nonzero DCT coefficients that survive the
quantization stage are encoded and transmitted. This quantizing is somewhat analogous to
Mu-law and A-law non-uniform quantization, where the voice signals at the lower amplitudes
(which we are more likely to encounter) will be conditioned to provide more information at a
slight cost to information at higher amplitudes.
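The divide-and-round step described above can be sketched as follows. This is a minimal Python illustration on a 2 x 2 corner of a coefficient array; the coefficient values and step sizes are hypothetical and are not taken from the H.261 specification.

```python
def quantize(coeffs, steps):
    """Divide each coefficient by its step size and round to an integer level."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Reconstruct approximate coefficients from the integer levels."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, steps)]

# Hypothetical values: small steps at the upper left, larger toward the lower right.
coeffs = [[800.0, 12.0],
          [-9.0, 3.0]]
steps = [[8, 16],
         [16, 32]]

levels = quantize(coeffs, steps)
rebuilt = dequantize(levels, steps)
```

The small coefficient in the lower right is driven to zero, while the large upper left coefficient is reconstructed exactly; the difference between `coeffs` and `rebuilt` is the information lost to quantization.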



4. Motion Compensation and Estimation 

When the motion of the source is generally limited, it is very likely that the
luminance and chrominance blocks are not much different between successive picture
frames. In H.261, motion prediction is done on the luminance channel on blocks of 16 x 16
pixels. Two techniques exploit these similarities: motion prediction and motion
compensation. Motion prediction is performed at the encoder to determine what the motion
vector should be, whereas motion compensation consists of moving blocks of data around,
based upon that motion vector. As shown in Figure 3-5, the encoder shifts the reference block
and compares its bit structure with the bit structure of the target block, looking for the
closest match. Consequently, only the differences in pixel values between the current
macroblock and its matched macroblock are encoded.
Motion compensation is effective because it moves only the
section of video where motion has occurred, rather than the entire video area for every frame.
Essentially, each frame can be reasonably coded by detecting the changes (which are usually
very small) from the previous one. This function is a very important aspect in lowering the
bit rate. Before a reference frame can be established, intra-frame coding must be done.
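The search for the closest match can be sketched with an exhaustive block-matching routine. This is a minimal Python illustration, not the search strategy of any particular codec: the sum of absolute differences (SAD) is assumed as the matching criterion, frames are plain lists of rows, and the function names and default search range are hypothetical.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))

def best_motion_vector(reference, target, x, y, size=16, search=2):
    """Search +/- `search` pixels around (x, y) in the reference frame for the
    block that best matches the target block. Returns (dx, dy, cost)."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(target, x, y, reference, rx, ry, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

A cost of zero means the block was found unchanged at the displaced position, so only the motion vector needs to be sent; otherwise the residual differences are what get coded.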




Figure 3-5 P-coding (interframe) [Jin, 96] 



Figure 3-6 shows how each macroblock is intra-frame encoded. The intra-frame is
used as an accessing point. Figure 3-7 shows the frame sequencing used in H.261.

Figure 3-6 I-coding (intraframe) [Jin, 96]



Figure 3-7 H.261 frame sequence encoding [Jin, 96]

After quantization, it is not unusual for more than half of all of the DCT coefficients
to be equal to zero. One coding scheme, run-length coding, is used to take advantage of
this. In run-length coding, except for the DC coefficients of the intra-coded blocks, all DCT
coefficients are encoded using the run-length algorithm in a zig-zag fashion, as shown in
Figure 3-8. For each nonzero value, the number of zeros that preceded the value and the
amplitude of the value itself form a pair. If the last nonzero value does not happen to be
the last coefficient in the block, an End-of-Block code is attached to tell the decoder that
there are no more nonzero coefficients left in the 8 x 8 block.
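The zig-zag scan and pairing described above can be sketched as follows. This Python illustration follows the description in the text rather than the bit-level syntax of the standard: the string "EOB" stands in for the End-of-Block code, and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def run_length_pairs(block):
    """Encode the coefficients after the DC term as (zero_run, value) pairs."""
    n = len(block)
    scan = [block[r][c] for r, c in zigzag_order(n)][1:]  # skip the DC term
    pairs, run = [], 0
    for value in scan:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:  # trailing zeros are replaced by an End-of-Block marker
        pairs.append("EOB")
    return pairs

# Small 4 x 4 example block with made-up quantized coefficients.
example = [[50, 3, 0, 0],
           [-2, 0, 0, 0],
           [0, 0, 0, 0],
           [1, 0, 0, 0]]
pairs = run_length_pairs(example)
```

Because the scan visits low frequencies first, the long runs of zeros produced by quantization collapse into a handful of pairs followed by a single end marker.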




The coded pair will then go through a variable length encoding where each pair has 
its own code word, assigned through a variable length code. The basic idea is to assign 
shorter code words to represent more frequently occurring values and longer code words to 
the less frequent values, in order to compress data even further. Huffman coding is the most 
common. Many Huffman tables used for different types of data are specified in the H.261 
standard. 
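The principle of giving shorter code words to more frequent values can be sketched with a small Huffman code builder. This Python illustration derives a code from observed symbol frequencies; it is not how an H.261 codec obtains its code words, since the standard specifies its own tables, and the function name and sample symbols are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code that gives shorter code words to frequent symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Made-up (zero_run, value) pairs with very different frequencies.
pairs = [(0, 1)] * 8 + [(1, 1)] * 4 + [(0, -1)] * 2 + [(3, 2)]
code = huffman_code(pairs)
```

The most frequent pair receives a one-bit code word and the rarest pairs receive the longest ones, so the total number of bits is smaller than with fixed-length code words.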

H.261 is only the baseline video compression standard for videoconferencing. There
are many faster and more efficient codecs, which are H.261 compliant, that use their own
proprietary algorithms. Nevertheless, even a minimally H.261-compliant codec can provide
tremendous compression ratios (well beyond 100:1). The table in Figure 3-9 shows the bit
rates required to transmit video at various frame rates with 100:1 data compression using
the H.261 compression standard.
BIT RATES REQUIRED TO TRANSMIT COMPRESSED VIDEO IN CIF
AND QCIF FORMAT

Figure 3-9 Frame Rate vs. Bit Rate for compressed data
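The kind of arithmetic behind such a table can be sketched in Python. The figures below are not the values from Figure 3-9; they assume 12 bits per pixel on average (8-bit samples, with four luminance and two chrominance 8x8 blocks per 16 x 16 macroblock) and a flat 100:1 compression ratio. The function names are hypothetical.

```python
def raw_bit_rate(width, height, frames_per_sec, bits_per_pixel=12):
    """Uncompressed bit rate in bits per second (assumed 12 bits per pixel)."""
    return width * height * bits_per_pixel * frames_per_sec

def compressed_bit_rate(raw_bps, ratio=100):
    """Bit rate after applying an assumed compression ratio."""
    return raw_bps / ratio

cif_raw = raw_bit_rate(352, 288, 30)    # CIF at 30 frames/sec
qcif_raw = raw_bit_rate(176, 144, 15)   # QCIF at 15 frames/sec
cif_compressed = compressed_bit_rate(cif_raw)
qcif_compressed = compressed_bit_rate(qcif_raw)
```

Under these assumptions, uncompressed CIF at 30 frames per second needs about 36.5 Mbps, which 100:1 compression brings to roughly 365 Kbps; QCIF at 15 frames per second drops from about 4.6 Mbps to roughly 46 Kbps.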



5. MPEG 

Many of the compression techniques used in the H.261 standard are similar to those
used in MPEG-1, but there are three major differences: data structure, coding type, and
frame ordering [Jin, 96]. Because MPEG is targeted for more bandwidth-intensive
applications than H.261, this thesis will not provide an in-depth description of MPEG
standards.



G. AUDIO COMPRESSION 

Audio compression is the most important function of videoconferencing
systems, across all generations. Currently, Mu-law and A-law are the most common
compression techniques used to condense audio data in videoconferencing systems.
Both are non-uniform pulse code modulation (PCM) encoding techniques that use the
quantized values of the samples in order to present a discrete representation of the audio
signal. Each sample is represented by a code word that is 8 bits in length. Mu-law and A-law
transformations allow 8 bits per sample to represent the same range of values that would be
achieved with 14 bits per sample using uniform PCM, which translates to a compression
ratio of 1.75:1. Due to the logarithmic nature of the transformation, the low
amplitude samples are encoded with greater accuracy than the higher amplitude samples.
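The logarithmic transformation can be sketched with the continuous Mu-law companding formula. This Python illustration is an idealized version: G.711 itself uses a segmented approximation of this curve, and the simple rounding to a signed 8-bit code and the function names are assumptions made for the example.

```python
import math

MU = 255  # companding constant used for Mu-law

def mu_law_compress(x):
    """Map a sample in [-1, 1] to [-1, 1] using logarithmic companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def to_code(x, bits=8):
    """Quantize the companded sample to a signed integer code."""
    levels = 2 ** (bits - 1) - 1
    return round(mu_law_compress(x) * levels)

def from_code(code, bits=8):
    """Reconstruct an approximate sample from its integer code."""
    levels = 2 ** (bits - 1) - 1
    return mu_law_expand(code / levels)

# Spacing between adjacent reconstruction levels at low and high amplitude.
small_step = from_code(1) - from_code(0)
large_step = from_code(127) - from_code(126)
compression_ratio = 14 / 8
```

The spacing between adjacent codes is far finer near zero than near full scale, which is why low-amplitude samples are represented more accurately.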

Major techniques that are designed for audio signals:

• G.711 - 48-64 Kbps Narrow-band

• G.722 - 48-64 Kbps Wide-band

• G.723 - Speech coding at 5.3/6.3 Kbps

• G.728 - 16 Kbps Narrow-band

ITU-T recommendation G.711, “Pulse code modulation of voice frequencies,”
provides telephone quality audio (narrow-band 3 kHz).

G.722 provides stereo quality (wide-band 7 kHz). At a typically higher data rate,
usually > 256 Kbps, it provides the best audio quality available [VTEL, 95]. G.722 uses
adaptive differential pulse code modulation (ADPCM), which uses predictive algorithms to
predict the values of adjacent samples and encodes the difference between the predicted and
actual sample. It is adaptive because the encoders can also
adapt to changing quantizing or prediction parameters. ADPCM generally achieves ratios of
2:1, as compared with 1.75:1 for Mu-law or A-law. G.722 has three modes of operation: 64,
56, and 48 Kbps. If a 64 Kbps communication channel is used, the 56 or 48 Kbps modes will
leave an additional 8 or 16 Kbps of bandwidth, respectively, for other data.
For audio over narrow-band POTS lines, there’s G.723, which supports a
compressed 3.4 kHz signal. It defines speech coding for audio transmitted at 5.3/6.3 Kbps.

G.728 provides narrow-band audio, which is important for lower bit rates < 256 
Kbps. It is designed specifically for speech signals. G.728 uses another type of predictive 
coding called code excited linear prediction (CELP), which requires a bandwidth of 16 Kbps 
and is very computationally complex, requiring special hardware. 

As described in H.320, if two different classes of audio compression are used, the
less capable of the two will be used. For example, if a Class 3 system (G.728) establishes a
call with a Class 1 system, the audio will be G.711 [VTEL, 95].



H. DATA STANDARDS 

The T.120 standard focuses on collaborative computing, common whiteboard, and 
applications sharing during any H.32x videoconference. It defines the communication and 
application protocols and services that support real-time multipoint data communications. 
The specification also allows data-only T.120 sessions, when no video communications are 
required. In addition, T.120 supports multipoint meetings with participants using different 
transmission media. T.120 recommendations include:

• T.122 Multipoint Communication Service

• T.123 Network Specific Transport Protocols

• T.124 Generic Conference Control

• T.126 Still Image Exchange
I. SUMMARY



As network architectures have evolved, newer standards are continually 
implemented. But in order to provide cross-platform capability, flexibility, scalability, and 
accommodation of newer technologies as they emerge, the protocols and standards used in 
videoconferencing for distance learning must be compatible with the standards from the 
International Standards bodies. These standards should be the baseline used in 
videoconferencing systems for distance learning. 

Using commonly available software codecs not only reduces the bandwidth demand on
already strained data pipes, but also allows more data to be stored in a PC’s
storage device(s). This provides the ability for more course material to be
streamed on demand, providing the asynchronous capability necessary for distance learning
to sea.

Although software video codecs lack the compression speed of dedicated codecs, 
they have the advantage of low cost. Furthermore, more powerful processors like Intel’s 
Pentium II with MMX technology improve video compression/decompression. 
IV. IP MULTICASTING AND THE MBone



A. INTRODUCTION 

This chapter focuses upon multicasting videoconferencing sessions over IP-based
networks. It must be noted, however, that IP is very flexible. It can be used over a variety of
network segments, including ATM, frame relay, switched multimegabit data service
(SMDS), satellite, dial-up asynchronous, and ISDN. This chapter also discusses the major
protocols supported by the IP Multicast Initiative (IPMI). Founded in 1996, the IPMI is a
multi-vendor cooperative effort to promote the deployment of industry-standard IP
Multicast technologies, many of which are IETF Requests for Comment (RFCs). Many
members are leaders in the high technology industry, including IBM, Intel, Microsoft, Cisco
Systems, Silicon Graphics, and GTE, among others.



B. BACKGROUND 

As shown in Chapter III, videoconferencing compression algorithms help reduce
network bandwidth requirements, allowing videoconference applications to deliver real-time,
quality video and audio data across networks. But compression solves only one part
of the bandwidth issue. For example, what if a videoconferencing application needed to
send data to multiple hosts simultaneously? One way to accomplish that task would be to
retransmit identical IP packets to each recipient. If there are many recipients, this could
potentially strain the network. To avoid this problem, the Internet Engineering Task Force
(IETF), an arm of the Internet Architecture Board (IAB) that approves Internet standards,
endorsed IP multicast as a standards-based solution. There are two items
that make multicasting practical on the Internet. They are the lack of unlimited bandwidth on
the Internet backbone connections, and the widespread availability of workstations across a
wide global network infrastructure [Macedonia, Brutzman 94].



C. IP MULTICASTING 

RFC 1112, “Host Extensions for IP Multicasting,” authored by Steve Deering in
1989, was designed as an extension of IP Version 4. It describes IP multicasting as “the
transmission of an IP datagram to a ‘host group’, i.e. a set of zero or more hosts identified
by a single IP destination address” [Johnson, 97]. IP multicast allows applications to send
data over the Internet to many simultaneous recipients in a more economical fashion than
unicast or broadcast IP transmissions. Unicast IP is from a single source to a single
destination (one-to-one), so in order to send information to multiple recipients using
unicast, an application needs to send multiple copies of IP datagrams, which might saturate
the transmission medium. Broadcast IP sends data to all of the participants in a network
whether they want it or not.
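The difference in sender load can be sketched with simple arithmetic. The stream rate and recipient count in this Python illustration are made-up values, and the function name is hypothetical.

```python
def sender_load_kbps(stream_kbps, recipients, multicast=False):
    """Bandwidth the sender must transmit to reach every recipient."""
    if recipients < 0:
        raise ValueError("recipients cannot be negative")
    # Unicast needs one copy per recipient; multicast sends a single copy
    # addressed to the group, regardless of how many hosts have joined.
    copies = 1 if multicast else recipients
    return stream_kbps * copies

unicast_load = sender_load_kbps(128, 20)                     # 20 separate copies
multicast_load = sender_load_kbps(128, 20, multicast=True)   # one copy to the group
```

For a hypothetical 128 Kbps stream and 20 recipients, the unicast sender must transmit 2,560 Kbps, whereas the multicast sender transmits 128 Kbps however many hosts join.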

When Internet Protocol (IP) was developed, Class D IP addressing was designed to
facilitate multicasting. Unlike unicast IP addresses, which identify specific destinations,
Class D addresses identify a particular transmission session. Class D addresses are reserved
for groups rather than individual hosts. The addresses range from 224.0.0.0 to
239.255.255.255.
There are also certain special addresses (listed in RFC 1700 - “Assigned Numbers”):

• 224.0.0.1, the “all host group,” addresses all multicast hosts on a directly
connected net.

• 224.0.0.2 addresses all routers in a LAN.

• 224.0.0.0 through 224.0.0.255 is reserved for routing protocols and other low-
level topology discovery or maintenance protocols.

• 224.0.13.0 through 224.0.13.255 is reserved for Network News.
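The Class D range and the reserved low-level block can be checked with Python's standard ipaddress module. The sketch below is illustrative: the function name and category labels are made up, and 224.0.0.0/24 is used for the block reserved for routing and topology protocols.

```python
import ipaddress

CLASS_D = ipaddress.ip_network("224.0.0.0/4")         # 224.0.0.0 - 239.255.255.255
LOCAL_CONTROL = ipaddress.ip_network("224.0.0.0/24")  # routing/topology protocols

def describe(address):
    """Classify an IPv4 address with respect to the multicast ranges above."""
    ip = ipaddress.ip_address(address)
    if ip not in CLASS_D:
        return "not multicast"
    if ip in LOCAL_CONTROL:
        return "reserved (routing/topology protocols)"
    return "multicast group"
```

For example, `describe("224.0.0.1")` falls in the reserved block, while an ordinary unicast address such as 192.168.1.1 is reported as not multicast.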

With IP multicast, the source application is not necessarily aware of the 
destinations. Multicast applications send one copy of an IP packet over the network to a 
group address. A group of receivers may then participate by joining the particular multicast 
session group. The multicast IP datagram is delivered to all members of its destination 
host group (group Class D address) with the same ‘best effort’ reliability as regular unicast 
IP datagrams [Johnson, 97]. 

Some of the rudimentary requirements of IP multicast are: 

• Since hosts may leave or join a group at any time, membership in a host group of
an IP multicast session must be dynamic.

• There should be no restrictions on the location and number of groups that can 
participate. 

• At the application level, a host may have multiple data streams on different port 
numbers, on different sockets, in one or more applications [Johnson, 97]. 

The minimal hardware/software requirements needed to deliver IP multicasts end-to-end
are:

• Support for IP multicast transmission and reception in the host’s TCP/IP protocol
stack and operating system (see note 1).

• Software supporting Internet Group Management Protocol (IGMP), in order to
communicate requests to join multicast group(s), and receive multicast traffic.

• Network interface cards that efficiently filter for LAN data link layer addresses
that are mapped from network layer IP multicast addresses.

• IP multicast application software such as videoconferencing or file transfer. The
end-node applications should be flexible in terms of their support for existing
compression technologies and accommodation of newer technologies as they
emerge.

• Intermediate routers between the sender(s) and receiver(s) must be IP multicast-
capable (see note 2).

• Firewalls (i.e. packet-filtering software) may need to be reconfigured to permit
IP multicast traffic. [Johnson, 97]

Figure 4-1 is an overview of the requirements. 
Figure 4-1 Requirements for IP Multicasting



1 Windows NT, Windows 95, and the latest versions of UNIX support IP multicast. 

2 Multicasting capability can be enabled in most routers by simply updating the software and adding memory. 
When a host application requests membership in the host group associated with a
particular multicast session, the request is communicated to the subnet’s multicast router
and, if necessary, on to intermediate routers between the sender and receiver. When the
requested session is found, the router delivers the requested incoming multicast IP
datagrams to the requesting host, where the TCP/IP stack makes the data
available as input to the user’s application. Other stations filter out multicast packets at the
hardware level.
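From the application's point of view, requesting membership is typically a single socket option. The following Python sketch uses the standard socket module; the group address and port in the usage note are hypothetical, and joining on the default interface (0.0.0.0) is an assumption made for the example.

```python
import socket
import struct

def membership_request(group, interface="0.0.0.0"):
    """Pack the ip_mreq structure used with IP_ADD_MEMBERSHIP:
    the group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def open_receiver(group, port):
    """Create a UDP socket and ask the local TCP/IP stack to join `group`.
    Joining prompts the stack to report the membership to the local router."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A receiver could then call, for example, `open_receiver("224.2.0.1", 5004)` and read datagrams with `recvfrom`; the sender needs no knowledge of which hosts have joined.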

Multicast routers do not need to know the list of member hosts for each group. A router
only needs to know the groups for which there is at least one member on the subnet. A
multicast router attached to an Ethernet need associate only a single Ethernet multicast
address with each host group having a local member.
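The mapping from a Class D address to its Ethernet multicast address, as defined in RFC 1112, places the low-order 23 bits of the IP address into the low-order 23 bits of the Ethernet address 01:00:5E:00:00:00. A minimal Python sketch, with a hypothetical function name:

```python
import ipaddress

def multicast_mac(group):
    """Map an IPv4 multicast address to its Ethernet multicast address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not a Class D (multicast) address")
    low23 = int(ip) & 0x7FFFFF            # keep the low-order 23 bits
    mac = 0x01005E000000 | low23          # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
```

Because only 23 of the 28 variable bits of a Class D address are carried over, different groups such as 224.0.0.1 and 224.128.0.1 map to the same Ethernet address, so the receiving host's IP layer must still filter by the full group address.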

1. IP Multicast Protocols 

Like any other means of transporting data over network infrastructures, IP multicast 
comes with an array of protocols that help provide the framework for multicasting IP 
datagrams. The most fundamental of IP multicast protocols, Internet Group Management 
Protocol (IGMP Ver. 2), described in RFC 2236, is used by multicast routers in order to 
learn the existence of host group memberships. It is the baseline protocol necessary to 
conduct an IP multicast session. 

The protocols used to ensure that the needed bandwidth and QoS are available
include Real-Time Transport Protocol (RTP), Real-Time Control Protocol (RTCP), Real-Time
Streaming Protocol (RTSP), and Resource Reservation Protocol (RSVP). There are
also associated routing protocols such as Protocol Independent Multicast (PIM), Multicast
Open Shortest Path First (MOSPF), and Distance Vector Multicast Routing Protocol
(DVMRP).

There are also transport issues that need to be addressed with IP multicast.
Applications that are IP multicast capable are not designed for use with reliable,
connection-oriented transports such as TCP, because their datagrams are addressed to a host
group rather than to an individual destination. They also do not require guaranteed
in-sequence delivery of IP packets. Furthermore, since IP datagrams do not follow a fixed
path, there is no assurance that the bandwidth needed for video and audio will be available.
Videoconferencing applications are better off tolerating missing data than enduring the
lengthy delays caused by TCP retransmissions. Therefore, a simpler transport such as User
Datagram Protocol (UDP), a transport layer protocol that provides only error detection, does
a more than adequate job of transporting videoconferencing data.

a. Internet Group Management Protocol (IGMP) 

Internet Group Management Protocol (IGMP) performs two main functions. 
It is used by hosts to join IP multicast sessions, and by multicast routers to learn the 
existence of host group members on their directly attached subnets, identify designated 
multicast routers in a LAN, and propagate group information over the Internet. It is loosely 
analogous to Internet Control Message Protocol (ICMP), which is used in PING 
applications. [Johnson, 97] 

Each multicast router sends IGMP queries (Host Membership Query), and
the hosts respond by reporting their host group memberships (Host Membership Report).
This query and response exchange is carried in IGMP messages encapsulated in IP
datagrams. To determine whether any hosts on a local subnet belong to a multicast group,
one multicast router per subnet periodically sends a hardware (data link layer) multicast
IGMP Host Membership Query (network address 224.0.0.1) to all IP end nodes on its
subnet. This message asks them to report back on the host group memberships of their
processes. These query messages have a time to live (TTL) of 1 to limit their transmission to
the network directly attached to the router. [Petitt, 96]
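
To make the message exchange concrete, the sketch below packs an IGMP message, assuming the
8-byte IGMP Version 2 layout of RFC 2236 (type, maximum response time in tenths of a
second, checksum, group address). The function names are illustrative, and the code only
builds the message bytes; it does not transmit them.

```python
import socket
import struct

IGMP_MEMBERSHIP_QUERY = 0x11       # type value from RFC 2236
IGMP_V2_MEMBERSHIP_REPORT = 0x16   # type value from RFC 2236


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmp(msg_type: int, max_resp_tenths: int, group: str = "0.0.0.0") -> bytes:
    """Build an IGMPv2 message; the checksum is computed over the whole message."""
    group_bytes = socket.inet_aton(group)
    unchecked = struct.pack("!BBH4s", msg_type, max_resp_tenths, 0, group_bytes)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp_tenths, checksum, group_bytes)


# General query: group field is zero, hosts may wait up to 10 seconds to answer.
query = build_igmp(IGMP_MEMBERSHIP_QUERY, 100)
print(query.hex())  # 1164ee9b00000000
```

A router would send such a query to 224.0.0.1 in an IP datagram whose TTL is 1, which keeps
it on the directly attached network as described above.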

Each host then sends back one IGMP Host Membership Report per group to the group
address, so that all group members see it. When other hosts see a Host Membership Report
for a group to which they belong, they cancel their own report for that group. Hence, only
one member of the group will report membership to the router for a particular group address.
Periodically, local multicast routers will send IGMP Host Membership Queries to the “all
hosts” group to verify current memberships. Although IGMP packets are routinely transmitted,
their bandwidth use is insignificant compared to the multicast application’s traffic.
Figure 4-2 shows an IGMP request on a LAN.
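
The report suppression described above can be sketched with a small simulation. It assumes
that each host delays its report by a random interval after the query, which is how IGMP
spreads the responses; the host names, group addresses, and the function name
simulate_reports are hypothetical.

```python
import random


def simulate_reports(members, max_delay=10.0, seed=None):
    """Return {group: host that actually reports} after one query.

    members maps a host name to the set of groups it has joined.
    """
    rng = random.Random(seed)
    timers = []
    for host, groups in members.items():
        for group in groups:
            # Each host starts an independent random timer per group.
            timers.append((rng.uniform(0, max_delay), host, group))
    reporters = {}
    for _delay, host, group in sorted(timers):
        # The first timer to expire sends the report to the group address;
        # every other member hears it and cancels its own pending report.
        reporters.setdefault(group, host)
    return reporters


lan = {
    "host-a": {"224.2.0.1"},
    "host-b": {"224.2.0.1", "224.2.0.2"},
    "host-c": {"224.2.0.2"},
}
print(simulate_reports(lan, seed=1))
```

However many hosts belong to a group, the router sees a single report per group, which is
why the query traffic stays small relative to the audio and video streams.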




Figure 4-2 IGMP Messages on a LAN [Johnson, 97]

When the last station on a subnet leaves a multicast group, the router
“prunes” the multicast data stream associated with it by ceasing to forward the data stream
to that subnet.



b. Real-Time Transport Protocol Version 2 (RTP) 

Real-Time Transport Protocol (RTP), defined in RFCs 1889 and 1890,
provides end-to-end delivery services to support applications transmitting real-time
data [Johnson, 97]. Among the services that RTP provides are payload type identification,
packet sequence numbering, and time stamping. The delivery of RTP packets is monitored
by Real-Time Control Protocol (RTCP), which is discussed later.

RTP does not provide all of the functionality of a typical transport
protocol. It is a header format running in combination with other transport protocols in
order to take advantage of their functionalities. The RTP header provides timing 
information to synchronize and display audio and video data, and also to determine if 
packets are lost or arrive out of order. In order to allow multiple data and compression 
types, the header specifies the payload type by characterizing what type of audio and video 
encoding is carried in the RTP packet. This enables users to have the option to change the 
encoding methods during a conferencing session, in response to network congestion, or to 
accommodate low-bandwidth requirements of a new conference participant [Johnson, 97]. 
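
The header fields mentioned above can be illustrated with a short sketch that packs and
parses the 12-byte fixed RTP header, assuming the RFC 1889 layout with no padding, no
header extension, and no contributing sources. The function names are illustrative, and the
loss check ignores sequence number wraparound for brevity.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the fixed RTP header (version 2, no padding/extension/CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet):
    """Extract the fields a receiver needs for playout and loss detection."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,   # identifies the audio/video encoding
        "seq": seq,                     # reveals loss and reordering
        "timestamp": timestamp,         # drives synchronization and playout
        "ssrc": ssrc,                   # identifies the sending source
    }


def missing_sequence_numbers(received):
    """List sequence numbers absent from a batch (no wraparound handling)."""
    seen = set(received)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


header = pack_rtp_header(payload_type=31, seq=7, timestamp=90000, ssrc=0x1234)
print(parse_rtp_header(header)["payload_type"])   # 31 (H.261 video in RFC 1890)
print(missing_sequence_numbers([5, 6, 8, 10]))    # [7, 9]
```

Because the payload type travels in every packet, a sender can switch encodings in
mid-session simply by changing this field, without renegotiating the session.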

RTP does not ensure timely delivery or provide QoS guarantees. It does not
guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying
network is reliable. For applications like videoconferencing that require these types of
guarantees, RTP must be accompanied by other mechanisms [Johnson, 97].

c. Real-Time Control Protocol (RTCP)

RTCP, also standardized in RFCs 1889 and 1890, is a control protocol that
works in conjunction with RTP. RTCP information, periodically transmitted by each
participant in an RTP session to all other participants, is used by the applications to control
the performance of the conference and for diagnostic purposes.

RTCP performs four primary functions. (a) RTCP provides feedback
information about the quality of the transmission to the applications. The statistics include
the number of packets sent, the number of packets lost, interarrival jitter, etc. (b) RTCP
identifies each RTP source through a transport-level identifier called the canonical
name (CNAME). The CNAME is used to keep track of participants in a session and to
synchronize audio and video. (c) RTCP controls its transmission intervals in order to
prevent control traffic from overwhelming network resources. RTCP control traffic is
limited to five percent of the overall session traffic. This control on RTCP allows RTP to
scale up to a large number of session participants. (d) An optional function can be used to
convey a small amount of information to all session participants. In distance learning, this
information can be used to identify the participants in a particular training session. For
example, RTCP might carry a personal name to identify a participant on the user’s display.
[Johnson, 97]
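
The effect of the five percent limit on the reporting interval can be shown with a rough
calculation. The sketch below is a simplification of the RFC 1889 rule: it assumes an
average RTCP packet of 100 bytes (an illustrative figure), applies the recommended
5-second minimum interval, and omits the sender/receiver bandwidth split and the
randomization that the RFC also specifies.

```python
RTCP_FRACTION = 0.05      # share of session bandwidth allotted to RTCP
MIN_INTERVAL_S = 5.0      # recommended minimum interval between reports


def rtcp_interval(participants, session_bps, avg_rtcp_bytes=100):
    """Approximate seconds between RTCP reports from one participant."""
    rtcp_bps = RTCP_FRACTION * session_bps
    interval = participants * avg_rtcp_bytes * 8 / rtcp_bps
    return max(MIN_INTERVAL_S, interval)


# A 128 kbps session: small classes report every 5 s, large ones far less often.
for n in (10, 100, 1000):
    print(n, "participants ->", round(rtcp_interval(n, 128_000), 1), "s")
```

As the number of participants grows, each one reports less frequently, so the total control
traffic stays at roughly the same fraction of the session bandwidth.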

Since RTCP sends feedback to all of the recipients of a multicast stream,
individual users can determine if a problem is specific to the local end node or system-wide.
RTP and RTCP information is simply data from the point of view of the routers that move
the packets to their destinations. To prioritize data streams and provide a guaranteed quality
of service, other protocols must be used [Steinke, 96].

d. Resource Reservation Protocol (RSVP)

In an Internet environment with a myriad of routers and switches, packet 
queuing can lead to variable packet delivery delays in different parts of the network. QoS 
considerations for a multicast application include tolerance to jitter, delay, and lost packets. 
In order for the network to provide QoS, applications must be able to reserve and control 
network services [Johnson, 97]. This is not an issue on networks with sufficient bandwidth, 
but considering the packet-based networks targeted for use in this thesis, QoS is a major 
issue. 

The Resource Reservation Protocol (RSVP) is a draft protocol for resource
reservation, still under development [Hurwicz, 97]. Elementary RSVP requests consist of
dynamic request specifications for end-to-end desired QoS and definitions of the set of data 
packets to receive the QoS. It aims to efficiently set up a guaranteed QoS resource 
reservation, supporting unicast and multicast routing protocols, and is expected to scale well 
for large multicast delivery groups. RSVP is useful in environments where QoS 
reservations can be supported by reallocating (rather than adding) resources. In IP 
multicast, a host sends an IGMP message to join the group and then sends an RSVP 
message to reserve resources along the delivery path(s) of that group. The RSVP service 
request is initially sent to a local server. The local server will validate the request and then 
forward the request. 

RSVP promises access to Internet integrated services. The hosts and the 
network work together to achieve guaranteed quality of end-to-end transmission. However, 
in order to achieve end-to-end QoS, all hosts, routers and other network infrastructure 
elements between the receiver and sender must support RSVP. They must reserve system
resources such as bandwidth, CPU and memory buffers in order to satisfy QoS reservations.
RSVP rides on top of IP, and is used by routers to deliver QoS control requests to all nodes 
along the path(s) and to establish and maintain statistics in order to provide the requested 
services. After the reservation has been made, the router supporting RSVP determines the 
route and QoS class for each incoming packet and the scheduler makes forwarding decisions 
for every outgoing packet [Johnson, 97]. 

Since RSVP is receiver initiated, resource requests travel in only one direction.
At each node along the reverse path from the receiver toward the sender, RSVP attempts to
make a resource reservation for the requested stream. This receiver-initiated propagation
delivers control messages only up to the node of the spanning tree where they merge with
another reservation for the same source stream, thus preserving bandwidth. Receiver
initiation achieves two goals: scalability, because receiver-initiated joining delivers
control messages only along those parts of the tree that need the information; and
heterogeneity, because individual receivers can choose whether to participate and can
request different levels of reservation. [Precept, 97]

Based upon the admission and policy controls of the underlying hardware, at
each node, one of two general actions takes place: the node makes a reservation or forwards
the request upstream. These controls are not a part of RSVP, but are utilized by the
equipment. Admission control determines whether the node has sufficient resources, and
policy control determines whether the user has authorization to make a reservation. If
the reservation is rejected, RSVP returns an error message to the appropriate receiver(s). If
accepted, the node is configured to provide the desired QoS. If the RSVP request is
forwarded upstream, it continues to propagate along the reverse path towards the appropriate
senders. [Johnson, 97]
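
The per-node decision described above, together with the merging of reservations for the
same stream, can be modeled in a few lines. This is a toy model under stated assumptions,
not an implementation of RSVP: the Node class, its capacity figures, the user names, and
the result strings are all hypothetical, and real admission and policy control are far
richer.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity_kbps: int
    authorized_users: set
    reserved: dict = field(default_factory=dict)   # stream -> reserved kbps

    def handle_request(self, user, stream, kbps):
        if user not in self.authorized_users:        # policy control
            return "error: policy control rejected"
        current = self.reserved.get(stream, 0)
        if kbps <= current:
            # An existing reservation for this stream already covers the
            # request, so it merges here and travels no further upstream.
            return "merged"
        extra = kbps - current
        if sum(self.reserved.values()) + extra > self.capacity_kbps:
            return "error: admission control rejected"   # admission control
        self.reserved[stream] = kbps
        return "reserved; forward upstream"


def reserve_along_path(path, user, stream, kbps):
    """Walk the reverse path from the receiver's first hop toward the sender."""
    results = []
    for node in path:
        outcome = node.handle_request(user, stream, kbps)
        results.append((node.name, outcome))
        if not outcome.startswith("reserved"):
            break
    return results


path = [Node("edge", 256, {"alice", "bob"}), Node("core", 256, {"alice", "bob"})]
print(reserve_along_path(path, "alice", "lecture", 128))
print(reserve_along_path(path, "bob", "lecture", 64))   # merges at the edge node
```

The second request stops at the first node because the stream is already reserved there,
which is the bandwidth-preserving behavior attributed to receiver initiation.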

One drawback of RSVP is the computational load placed on
routers to inspect and handle packets in a priority order. Approaches such as tag switching
are being developed to help with this drawback. Another area of research is enhancing 
RSVP to use routing services that provide alternate and fixed paths. Finally, RSVP has no 
way to handle network overload that may occur if multiple users request the maximum 
bandwidth at the same time [Andrews, 97]. 

RSVP continues to be under review by the Internet Engineering Task Force
(IETF), and is not widely deployed. Similar work has been done on Internet Protocol
Version 6 (IPv6) to support resource reservation and flow setup for multicasting. Figure 4-3
is an illustration of the method.

Figure 4-3 RSVP Protocol [Johnson, 97]



e. Real-Time Streaming Protocol (RTSP) 

RTSP is considered more of a framework than a protocol. It works at the
application level to control unicast and multicast streaming and to enable interoperability
between different vendors’ clients and servers. RTSP essentially encodes and passes
multimedia stream control commands. In many respects, it resembles a protocol that
describes the functionality of a VCR remote control.
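
The remote control analogy can be made concrete with a sketch of the text requests a client
might send. It assumes the HTTP-like RTSP/1.0 request syntax (method, URL, version line,
then CSeq and Session headers); the URL and session identifier are hypothetical, and the
helper only formats the request without opening a connection.

```python
def rtsp_request(method, url, cseq, session=None):
    """Format an RTSP/1.0 request as text (illustrative helper)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        lines.append(f"Session: {session}")
    return "\r\n".join(lines) + "\r\n\r\n"


lecture = "rtsp://example.com/lecture"   # hypothetical stream URL
# The "buttons" of the remote control map onto request methods.
print(rtsp_request("PLAY", lecture, 3, session="12345678"))
print(rtsp_request("PAUSE", lecture, 4, session="12345678"))
print(rtsp_request("TEARDOWN", lecture, 5, session="12345678"))
```

The media itself does not flow over this control exchange; it is typically carried
separately, for example in RTP packets.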

2. Reliable IP Multicast 

Reliable connectivity ensures that all packets are received by all of the recipients.
For unicast IP services, error detection and correction are provided at the TCP layer. But
such traditional techniques for error detection and correction in a large-scale multicast
environment might result in an “ACK explosion” or a “NAK implosion,” where the
excessively large numbers of acknowledgement messages from large groups can swamp the
originating hosts sending the desired streams.
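
A rough calculation shows why per-packet acknowledgements do not scale. The sketch assumes
that every receiver acknowledges every data packet and that each acknowledgement occupies
40 bytes on the wire; both assumptions, and the figures in the example, are illustrative
only.

```python
def ack_load(receivers, data_packets_per_sec, ack_bytes=40):
    """Return (acknowledgements per second, bits per second) at the sender."""
    acks_per_sec = receivers * data_packets_per_sec
    return acks_per_sec, acks_per_sec * ack_bytes * 8


# One sender, 30 data packets per second, growing audience.
for audience in (1, 20, 1000):
    print(audience, "receivers ->", ack_load(audience, 30))
```

With 1000 receivers the sender would have to absorb 30,000 acknowledgements per second,
about 9.6 Mbps of return traffic, which far exceeds the 128 kbps stream used as an example
elsewhere in this chapter.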

There are currently no IETF standards for reliable IP multicast, but several Internet 
drafts have been submitted related to reliable multicasting, and an IRTF (Internet Research 
Task Force) working group has been formed to advance reliable multicast standards efforts 
[Johnson, 97]. 

Cisco’s proposed Pretty Good Multicasting (PGM) Reliable Transport Protocol is 
intended to make multicasting appropriate for mission-critical uses. Although this work is 
still under development, this protocol can be useful in areas such as common tactical 
pictures. 

As mentioned previously, videoconferencing applications are able to tolerate missing
data and still provide discernible video and audio. They also do not require guaranteed
in-sequence delivery of IP packets. Therefore, videoconferencing end-systems will not need
bit-perfect, in-order, acknowledged data. For military purposes, the multicast reliability
requirement is more essential for the common tactical architecture and cooperative
engagement issues. [Petitt, 96] evaluates the design choices of several reliable transport
layer multicast protocols that support those requirements.



3. Group Setup Protocols 

Users of videoconferencing must not only know about upcoming or current IP
multicast sessions, but also how to manage and coordinate them. Parameters for sessions
will include information such as the name and topic of the session; its multicast address;
date, time and duration; media types (e.g., audio), media encoding, and media ports; security
parameters; etc. There are currently several Internet drafts for these types of protocols, but
a clear standard has not emerged. Still, tools are currently available. For example, the
session directory tool, sdr, is widely used on the MBone. Similarly, Precept’s IP/TV
Program Guide has a directory embedded in a Web page.
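
The session parameters listed above can be gathered into a compact text description. The
sketch below loosely follows the Session Description Protocol (SDP) line format, which is
assumed here to be the kind of description that session directory tools announce. It is not
a validated SDP generator, and every value in the example (names, addresses, times, ports)
is hypothetical.

```python
def session_description(name, origin, group, ttl, start, stop, media):
    """Build an SDP-style description; media is a list of (kind, port, payload type)."""
    lines = [
        "v=0",                          # description format version
        f"o={origin}",                  # who created the session
        f"s={name}",                    # session name / topic
        f"c=IN IP4 {group}/{ttl}",      # multicast address and ttl scope
        f"t={start} {stop}",            # start and stop times
    ]
    for kind, port, payload_type in media:
        lines.append(f"m={kind} {port} RTP/AVP {payload_type}")
    return "\r\n".join(lines) + "\r\n"


print(session_description(
    name="Distance Learning Lecture",
    origin="instructor 2890844526 2890842807 IN IP4 192.0.2.10",
    group="224.2.17.12",
    ttl=127,
    start=2873397496,
    stop=2873404696,
    media=[("audio", 49170, 0), ("video", 51372, 31)],
))
```

A directory tool that receives such a description has what it needs to list the session
and to launch the audio and video applications with the right address and ports.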



4. Other IP Multicast Issues 
a. Router Support 

As with routing any IP datagram, multicasting requires routers to interact
with each other and exchange information about their neighbors. To implement IP multicast
most effectively, one item to consider is which routing protocol best suits the network
layout. On a routed network that includes native multicast, IP multicast traffic for a
particular source and destination group is typically transmitted via a spanning tree that
connects all of the hosts in the group. There are basically two approaches to multicast
routing: Dense-Mode and Sparse-Mode.

Dense-Mode multicast routing protocols follow an approach that assumes 
that the multicast group members are densely distributed throughout the network and 
bandwidth is abundant. These protocols rely on periodic flooding of the network with 
multicast traffic to distribute group membership information to all nodes in the network in 
order to set up and maintain the spanning tree. The protocols include Multicast Open
Shortest Path First (MOSPF), described in RFC 1584, Protocol-Independent Multicast-Dense
Mode (PIM-DM), and the earlier Distance-Vector Multicast Routing Protocol
(DVMRP), described in RFC 1075. DVMRP is currently used on the MBone, but is
becoming obsolete.

Sparse-Mode protocols are based upon the assumption that the multicast
group members are sparsely distributed throughout the network and bandwidth is not
necessarily widely available. Flooding in this case is not economical because of the waste of
bandwidth and the latency problems that occur when transmitting IP over large geographic
regions. Sparse-Mode routing protocols like Core Based Trees (CBT), RFC 2189, and
Protocol-Independent Multicast-Sparse Mode (PIM-SM), RFC 2117, are possible choices.
They build a single distribution tree, which is formed around a focal router (called a core in 
CBT and rendezvous point in PIM-Sparse Mode). Multicast traffic for the entire group is 
sent and received over the same tree, regardless of the source. The use of a shared tree can 
provide significant bandwidth savings for applications that have many active senders. 

Another concern is that many Internet Service Providers (ISPs) do not have
a protocol to deal with inter-domain multicast routing (IDMR). Existing multicast routing
protocols such as Protocol Independent Multicast (PIM), Multicast Open Shortest Path First
(MOSPF), and Distance Vector Multicast Routing Protocol (DVMRP) were not designed for
multiple autonomous systems that do not necessarily want to share all their routing
information. [Hurwicz, 97]

Although Border Gateway Protocol (BGP) provides inter-domain routing
capabilities for IP, there is no equivalent of BGP for IP Multicast. Currently the IETF’s
IGMP working group is developing a Border Gateway Multicast Protocol (BGMP)
specification. Until this shortcoming is addressed, the lack of an IDMR protocol limits
the scalability of IP Multicast and, along with limited bandwidth, is one of the major reasons
why the MBone has only about 30,000 users. Furthermore, growth will continue to be limited if
all of the routers have to contain all of the routing information for the whole network
[Hurwicz, 97].

Although new routers on the Internet are capable of supporting multicast,
most are not IP multicast enabled by default. Many ISPs are reluctant to deploy multicast
because of concerns such as the cost and complexity of upgrading older routers, router
resources consumed, reliability problems, an unclear business model (how does an ISP
charge for traffic, who pays, and how does peering, i.e., communications between ISPs,
work?), and a lack of diagnostic, simulation, and debugging tools. Even with these concerns,
some ISPs have already deployed multicast. For example, UUNET offers IP multicast as a
value-added service on its network. It has equipped each of its domestic Points of Presence
(POPs) with multicast routers in order to provide multicast service connections throughout
the continental United States. By next year, more ISPs can be expected to begin implementing
multicasting, especially as backbone traffic continues to rise and the cost threshold for
users decreases.

There is also the issue of incorporating QoS routing with various multicast
routing protocols. Native IP multicast protocols use various approaches to construct
delivery trees for efficient transmission. But without additional mechanisms, those routing
approaches are not guaranteed to provide a specified QoS. For example, when QoS
mechanisms are used to reserve and control network resources, the routers must not only
satisfy the added QoS requirements, but must also find the shortest path to a
destination when constructing a delivery tree.

b. Other Network Issues

Many IP multicast implementations have not been thoroughly tested because
many organizations have not enabled multicast capabilities in their networks [Hurwicz, 97].
Furthermore, there is no widely known data on how routers will react to a steady, high
volume of multicast multimedia traffic. Because IP Multicast uses the connectionless User
Datagram Protocol (UDP), and the most popular type of firewall, the application gateway,
cannot secure connectionless protocols, IP multicast is essentially incompatible with most
firewall strategies. In some applications, in order to allow transmissions through a firewall,
TCP is used in conjunction with UDP, by tunneling and by a ported multicast routing
program running on a host. Many firewall applications and routers will need to be
reconfigured, replaced, or upgraded in order to deal with multicast addressing, reliability,
and bandwidth issues.

D. MULTICAST BACKBONE (MBone) 

When LANs, WANs, and the Internet were initially developed and designed,
videoconferencing was not expected to be a viable possibility. Because of limited bandwidth,
sending video or audio was not considered possible or practical. However, as the
technology matured, the Multicast Backbone (MBone) and video/audio compression
techniques were developed, showing that videoconferencing was not only possible but also
practical.

The MBone is an experimental, virtual network that lies on top of the Internet. It
was initiated in early 1992 and named by Steve Casner of the University of Southern
California Information Sciences Institute. It provides one-to-many and many-to-many
network delivery services for multicast capable applications such as videoconferencing.
The MBone originated from a collaborative effort to multicast audio and video from meetings
of the Internet Engineering Task Force, and has been the testbed for many of the multicast
protocols mentioned earlier, such as IGMP and RTP. The MBone is continually being
developed by hundreds of researchers who are designing more effective and efficient
protocols and applications for videoconferencing. This section gives a brief introduction to
the MBone to provide an example of the viability of multicasting video and audio over
IP-based network architectures.

1. MBone Requirements 

The major technical prerequisite that makes multicasting possible over the MBone is
the use of network routers called mrouters. Mrouters are either upgraded commercial
routers or dedicated UNIX workstation-class machines running with modified kernels in
parallel with standard commercial routers [Macedonia, Brutzman, 94]. More and more
commercial routers are now supporting multicast. This will help eliminate the inefficiencies
and management headaches of duplicate routers and tunnels [Macedonia, Brutzman, 94]. The
mrouters use IGMP to learn the existence of host group membership on their directly attached
subnets, to identify designated multicast routers in a LAN, and to propagate group membership
information over the MBone. Tunneling further augments the MBone by allowing multicast
datagrams to be forwarded to other MBone subnets that support IP multicast across
intervening routers that do not. At the sending mrouter, IP multicast datagrams are
encapsulated in unicast IP datagrams and forwarded as unicast IP datagrams so that
intervening unicast routers and subnets can handle them. The receiving mrouter “strips” the
encapsulating unicast IP header from the multicast datagram and determines whether any of
its attached hosts have requested to join that multicast group.
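
The encapsulation and stripping steps can be sketched as follows. The sketch assumes
IP-in-IP encapsulation (IP protocol number 4) with a plain 20-byte outer IPv4 header, which
is one way such tunnels can be built; the tunnel endpoint addresses are hypothetical, and
the code manipulates bytes only, without sending anything.

```python
import socket
import struct

IPPROTO_IPIP = 4   # protocol number for an IP datagram carried inside IP


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(inner: bytes, tunnel_src: str, tunnel_dst: str, ttl: int = 64) -> bytes:
    """Wrap a multicast datagram in a unicast IPv4 header (sending mrouter)."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(inner),      # version/IHL, TOS, total length
        0, 0,                          # identification, flags/fragment offset
        ttl, IPPROTO_IPIP, 0,          # ttl, protocol, checksum placeholder
        socket.inet_aton(tunnel_src), socket.inet_aton(tunnel_dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:] + inner


def decapsulate(outer: bytes) -> bytes:
    """Strip the outer unicast header (receiving mrouter)."""
    if outer[9] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP datagram")
    header_len = (outer[0] & 0x0F) * 4
    return outer[header_len:]


inner = b"\x45" + bytes(19) + b"video"   # stand-in for a multicast datagram
outer = encapsulate(inner, "192.0.2.1", "198.51.100.7")
assert decapsulate(outer) == inner
```

Routers between the two tunnel endpoints see only the outer unicast addresses, which is why
they can forward the traffic without any multicast support of their own.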

As mentioned earlier, the overarching issue in videoconferencing is bandwidth. IP
multicasting partly addresses this issue by enabling one packet of information to reach many
destinations. For example, a 128-kilobit per second video stream (based on the typical data
rate of two channels of ISDN) uses the same bandwidth whether it is received by one location
or 20. However, there is one disadvantage. If all mrouters permitted packets to touch every
workstation in the MBone, video streams might waste valuable bandwidth by
sending streams to LANs that are not participants. For that reason, controls are needed to
limit the propagation of video stream packets across the MBone. Controls on multicast
packet propagation are implemented in two ways: the MBone limits the time to live (ttl) of
multicast packets, or it uses complex pruning algorithms to adaptively restrict the
transmission of multicast packets [Macedonia, Brutzman, 94]. MBone protocol
developers are successfully experimenting with automatically pruning and grafting subtrees,
and thresholds can set maximum bandwidth limits. Truncation is accomplished by
setting the ttl in a packet. The ttl is decremented, by one or more, each time the packet
passes through an mrouter. For example, if the ttl were set to 16, the multicast would be
limited to a smaller scale such as a school campus. If the ttl were 128, the packet could
potentially traverse most of the subnets on the MBone. Adjusting the ttl can assist in
limiting the transmission of video stream data to specific regions or areas. Consequently,
effective controls over the MBone can save precious bandwidth that uncontrolled
transmissions might otherwise use.
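
Both points in the preceding paragraph, the bandwidth saving and the ttl-based scoping, can
be illustrated with a small sketch. The forwarding model is a simplifying assumption: each
mrouter decrements the ttl by one and forwards the packet only if the remaining ttl is at
least the threshold configured for that hop. The threshold values are hypothetical.

```python
def unicast_vs_multicast_kbps(stream_kbps, receivers):
    """Sender-side bandwidth for separate unicast copies versus one multicast stream."""
    return stream_kbps * receivers, stream_kbps


def hops_crossed(initial_ttl, thresholds):
    """Count how many mrouter hops a packet crosses under the assumed model."""
    ttl = initial_ttl
    crossed = 0
    for threshold in thresholds:
        ttl -= 1                       # each mrouter decrements the ttl
        if ttl <= 0 or ttl < threshold:
            break                      # packet is not forwarded any further
        crossed += 1
    return crossed


print(unicast_vs_multicast_kbps(128, 20))   # (2560, 128)

# Hypothetical path: two campus hops, a regional boundary, a wide-area boundary.
path_thresholds = [1, 1, 32, 64]
print(hops_crossed(16, path_thresholds))    # 2: stays inside the campus
print(hops_crossed(128, path_thresholds))   # 4: crosses every boundary
```

Choosing the initial ttl is therefore a simple way for a sender to keep a lecture within a
campus or to let it travel across the wider MBone.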

In order to make the MBone community a viable and efficient topology, global
coordination is used to minimize congestion on the Internet. To add a new node to the
MBone, a new site announces itself to its Internet Service Provider (ISP) or the MBone
mailing list. Then, the nearest network providers decide on the most advantageous path
connection to minimize local or regional Internet traffic.

The MBone uses various application tools in order for end-users to receive and deliver
videoconferencing. The common applications are videoconference tool (vic), visual audio
tool (vat), robust audio tool (rat), shared whiteboard (wb), and session directory (sdr). Vat is
used for audio teleconferences. Shared whiteboard (wb), using T.120 protocols, can be used 
as a shared drawing surface, and it can be used to export and view postscript files. The sdr 
tool dynamically announces the availability of sessions by displaying active multicast 
groups. Sdr also launches multicast applications and automatically selects unused addresses 
for any new groups. Sdr makes announcements periodically over a well-known multicast 
address and port. 

One of the first significant uses of the MBone came about when NASA Select, the
in-house cable channel broadcast during space shuttle missions, was multicast so that it could
be viewed live from any MBone user’s desktop computer.

Although many practical applications have been developed on the MBone, it
continues to be used as a testing ground for IP multicast research and how it can be
leveraged for distance learning. One thesis, Internetworking: Economical Storage and
Retrieval of Digital Audio and Video for Distance Learning [Tiddy, 96], investigates the
usefulness and feasibility of applying networked storage of digitized video and audio, all via
the MBone, for distance learning. Currently there are prototypes that are being used to
deliver stored digitized data over the MBone. The Interactive Multimedia Jukebox Project,
which can be found at http://imj.gatech.edu, is a research effort to investigate the scalable
delivery of video-on-demand (VoD) service using multicast communication. The MBone
VCR on Demand Project, at
http://www.informatik.uni-mannheim.de/informatik/pi/projects/MVoD/, offers a solution for
the interactive remote recording and playback of multicast videoconferences.



E. MBone ISSUES IN DISTANCE LEARNING 

Because it was originally a developmental tool, the MBone has seen limited use in
the commercial environment, but it has already proved the great benefits of IP multicasting.
It has great potential to grow and cover the entire Internet. Nevertheless, many network
service providers have not enabled multicasting in many of their routers for various reasons.
Among them are the lack of maturity of the technology, uncertainty about whether ATM or IP
(or a combination of both) is the direction to take, and pricing issues. Many regional network
service providers still do not have an MBone connection.

The MBone is not easy to set up. Enabling a router for multicasting and installing
MBone tools is still something not normally done by network administrators. Many are
leery about how video services will impact network bandwidth. Also, MBone tools are
mainly developed for running on UNIX machines, and there are still problems porting the
tools to Windows machines. Finally, the tools are not as user friendly as some of the
commercial products.

The commercial sector is discovering the viability of multicasting and is starting to
develop tools that are based upon the MBone standards. Products such as White Pine’s
CU-SeeMe and Precept’s IP/TV are already MBone-compatible.
Videoconferencing applications will continue to mature, and the myriad of standards
will likely converge eventually. This process can be accomplished more easily if the newer
products are based upon thoroughly evaluated tools.



F. SUMMARY 

This chapter discusses the major multicasting protocols, technologies and issues that
are pertinent to using videoconferencing as a part of distance learning. It describes the
baseline issues that need to be addressed in order to multicast distance learning lectures to
numerous recipients across an IP-based network to sea. These proven protocols will make
videoconferencing over IP networks in DoD a practical solution. One primary reason is that
multicast groups, as opposed to dedicated networks, can be dynamically set up and torn
down. This flexibility is needed because of the constantly changing location of end-users
such as those receiving distance learning at sea.

Standards like IP multicasting, and the future implementation of IPv6, will address 
some of the QoS issues by supporting resource reservation and flow setup. Also, as older 
routers are replaced or upgraded to support multicast, videoconferencing over the Internet 
and NIPRnet between groups at numerous locations will become commonplace. IPv6 is 
designed to help improve delivery of data at regular intervals, which will help address 
Quality of Service (QoS) issues. Its packet headers will help define the types of service
(high quality paths in underlying network) that can be used for real-time delivery of audio
and video.

This chapter has also shown that, based upon the thorough testing and
implementation of multicasting, the hurdle currently facing the widespread
adoption of IP multicasting is deployment rather than the technology.






V. IMPLEMENTING IP MULTICAST ACROSS THE NAVAL NETWORK ARCHITECTURE TO SEA

A. INTRODUCTION 

This chapter provides an analysis of numerous options that can be used to leverage
DISN for IP multicast. They include desktop connectivity, the Unclassified but
Sensitive Internet Protocol Router Network (NIPRnet), satellite entry points (gateways),
Defense Satellite Communications System (DSCS) and/or C band SHF terminals (Challenge
Athena), and Automated Digital Networking System (ADNS).



B. BACKGROUND 

The goal of the Defense Information Infrastructure (DII) is to establish a seamless,
secure, robust, agile, reliable and cost-effective telecommunications network that will serve
as the end-to-end information transfer infrastructure for all DoD personnel and organizations
worldwide [DISA, 96]. The Defense Information Systems Network (DISN) architecture, a
component of the DII, is based upon a global network integrating existing Defense
Communications Systems assets, Military Satellite Communications (MILSATCOM),
Commercial SATCOM initiatives, leased telecommunications services, dedicated DoD
Service and Defense Agency networks, and mobile/deployable networks; i.e., the
consolidated worldwide enterprise level telecommunications infrastructure that provides the
end-to-end information transfer component of the DII [DISA, 96].

Through the Defense Information Systems Agency (DISA), DoD is continuously
identifying what architecture and standards DISN needs for a telecommunications
infrastructure that can support voice and video. Currently, the Defense Video Service-Global
(DVS-G), the transport network that DISN uses for videoconferencing, is mostly a
collection of dedicated room-based systems whose terrestrial components are connected by
ISDN services. One other segment of DISN that can be used to support videoconferencing
is NIPRnet. NIPRnet is an IP-based network that consists of the wide-area and local-area
network switching and transmission systems along with customer premises equipment
(CPE) in order to provide connectivity to DoD users.



C. DESKTOP SYSTEMS CONNECTIVITY 

1. POTS 

Videoconferencing applications conducted on DISN over Ethernet, token-ring, or 
serial modem connections are straightforward. Under the DISN Transmission Services - 
CONUS (DTS-C), AT&T provides information transport for the aggregate bandwidth of all 
customer Service Delivery Points homed off the Bandwidth Managers located in their 
respective access areas. Figure 5-1 is a diagram of the CONUS transmission service. To 
take advantage of the bulk transmission rates, AT&T bundles the access transmission into 
SONET for delivery to the Bandwidth Managers. At the customer access locations, 
transmission bandwidth interfaces at T1, T3 and SONET are provided. AT&T teams with 
Local Access Providers as required to meet the access area bandwidth requirements. 

Figure 5-1 DISN Architecture [DISA, 96] 



For POTS connectivity, commercial and DISN networks with dedicated dial-up 
connectivity can be used. However, even with optimal desktop hardware and software, 
performance is always in question because of the throughput problems associated with 
modem connections and dirty analog lines, which can cause bit errors and retransmissions. 

2. Asymmetric Digital Subscriber Line (ADSL) 

A twisted-pair phone line has a capacity far beyond the narrow 3-kHz channel used 
to carry an analog voice signal. Historically that capacity has gone unused, because 
it was reserved to compensate for signal loss in the line. A reemerging technology, 
Asymmetric Digital Subscriber Line (ADSL), overcomes this limitation and promises to 
provide download data rates up to 8Mbps to desktops, while upstream transmission rates 
will be at least ten times traditional modem data rates. ADSL is a modem technology that 
requires terminal devices at each end of the phone line (user to Local Exchange Carrier, 
LEC). Because of the high frequencies ADSL uses, the distance between the modem and the 
central office plays a significant role in an ADSL modem's throughput. The closer the 
modem is to the central office, the less signal degradation occurs. For example, the 
maximum distance from the central office for an 8Mbps download data rate would be 
approximately 1.7 miles, whereas 1.5Mbps has a 3.4 mile limit. 

Computer industry leaders such as Compaq, Intel, Microsoft and phone companies 
such as Ameritech, Bell Atlantic, SBC Communications, US West, Sprint and GTE have 
joined in an alliance to promote ADSL. ADSL technology has the potential to further 
enhance desktop videoconferencing by removing the bottleneck that currently plagues many 
users connected via standard POTS. Furthermore, what makes ADSL truly attractive is that 
the infrastructure required to support it, twisted-pair copper phone lines, is already in place. 
The current problems with ADSL are its lack of availability and high equipment costs. 

3. Cable Modems 

Cable Internet access is a relatively new transport technology that is still in the early 
stages of rollout. Until the past year, phone companies had been slow to implement 
ADSL in their central offices, which favored the growth and accessibility of cable 
Internet access. At the end-point, a cable modem connects to the cable television coaxial 
wiring and attaches to the end-user’s desktop via a standard Ethernet connection. Cable 
modems can theoretically deliver data at up to 350 times the rate of a 28.8Kbps modem 
(i.e. 10Mbps). Unlike point-to-point ADSL, cable is a shared medium, making its 
architecture a good fit for multicasting. Additionally, end users will not have to build 
from scratch to take advantage of multicasting. However, because the cable medium is 
shared, it is bound to run into congestion problems on the wire as users fill up local 
cable loops. 

Due to technical limitations, many cable Internet services do not allow users to send 
data via the cable link. Hybrid systems, in which incoming data arrives via the cable 
connection but outgoing data travels over a POTS modem connection, are the most 
common. Therefore, this current system works well if the end-user desires to receive 
videoconferencing data, but it is not a good set-up for delivering videoconferencing content 
from the desktop. 



D. TERRESTRIAL TRANSMISSION 

1. Routing 

a. Tunneling 

When deciding what routing protocol is most effective over a network, one 
must look at the network design and topologies. While the NIPRnet (as a whole) is not 
multicast enabled, the Cisco Systems routers used throughout the NIPRnet can be easily 
configured to support multicast. An alternative method is to form “tunnels” between 
selected multicast-enabled routers in the CONUS segment. Subnet islands can be created, 
similar to what is used in the MBone, to connect various end-users. These tunnels can be 
extended to gateways that have multicast-enabled routers and access to satellite terminals 
in order to provide a connection to remote (deployed) users. Some major ISPs (such as 
UUNET) are using tunneling to implement IP multicast across their networks. A tunnel is 
essentially a unicast virtual link that may cross several bridges and routers, over which 
multicast packets travel encapsulated in unicast packets. Tunnel endpoints can be either 
routers supporting native multicast routing or workstations running the mrouted multicast 
daemon. 

The advantage of tunneling is that it is quick and easy to implement and 
may be the best solution when both the number of customers using IP multicast and the 
quantity of IP multicast traffic are limited. Additionally, tunneling is a cost-effective way 
to gain the benefits of multicast without adding excessive risks or making mass hardware 
changes. However, there are two major disadvantages. The first is the need to set up 
and manage multicast servers or gateways. The second is that tunneling inserts the 
process of encapsulating IP multicast datagrams into unicast IP datagrams, essentially 
slowing down the transmission and introducing scaling problems [Hurwicz, 97]. 
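
The encapsulation step can be illustrated with a minimal Python sketch. It wraps an 
already-built multicast IP datagram in an outer unicast IPv4 header using IP-in-IP 
encapsulation (protocol number 4). The function names and addresses are hypothetical, 
header options are omitted, and the sketch does not claim to mirror how mrouted or any 
router implements tunnels; it only shows where the extra 20 bytes per packet and the extra 
per-packet processing come from. 

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IP-in-IP encapsulation


def ipv4_checksum(header: bytes) -> int:
    """Standard Internet checksum over a header of even length."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(inner: bytes, tunnel_src: str, tunnel_dst: str, ttl: int = 64) -> bytes:
    """Prefix a multicast datagram with a 20-byte unicast IPv4 header."""
    total_len = 20 + len(inner)
    fields = [0x45, 0, total_len, 0, 0, ttl, IPPROTO_IPIP, 0,
              socket.inet_aton(tunnel_src), socket.inet_aton(tunnel_dst)]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = ipv4_checksum(header)  # fill in the checksum field
    return struct.pack("!BBHHHBBH4s4s", *fields) + inner


def decapsulate(outer: bytes) -> bytes:
    """Strip the outer header at the far tunnel endpoint."""
    header_len = (outer[0] & 0x0F) * 4
    return outer[header_len:]
```

Every packet crossing the tunnel pays for one encapsulate call at the near endpoint and 
one decapsulate call at the far endpoint, which is the overhead the second disadvantage 
refers to. 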

b. PIM-SM 

As mentioned in Chapter IV, Sparse-Mode protocols are based upon the 
assumption that the multicast group members are sparsely distributed throughout the 
network and bandwidth is not necessarily widely available. PIM-SM addresses the need for 
a scalable wide-area, inter-domain, multicast routing mechanism in a large network 
infrastructure, such as NIPRnet. PIM-SM is available in Cisco Systems’ routers (which 
comprise most of the routers used on the NIPRnet). PIM-SM solves the routing table 
problem, found in DVMRP, by using the unicast tables for multicasting [Hurwicz, 97], but 
there are still some drawbacks. Because unicast routes adjust automatically to equipment or 
link failures, there is no guarantee that multicast traffic will follow a specific route that 
it should or must take. If not all routers are multicast enabled (which is highly likely), 
data may be lost. 
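
One way to picture how a multicast router can reuse the unicast table is the reverse path 
forwarding (RPF) check: a packet is accepted only if it arrives on the interface that the 
unicast table would use to reach the packet's source. The Python sketch below is 
illustrative only. The table contents and interface names are hypothetical, and a real 
PIM-SM implementation keeps considerably more state (rendezvous points, join and prune 
messages) than is shown here. 

```python
import ipaddress

# Hypothetical unicast routing table: prefix -> outgoing interface.
UNICAST_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): "serial0",
    ipaddress.ip_network("10.1.5.0/24"): "serial1",
    ipaddress.ip_network("0.0.0.0/0"): "ethernet0",
}


def unicast_interface(address: str) -> str:
    """Longest-prefix match against the unicast table."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in UNICAST_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return UNICAST_TABLE[best]


def rpf_accept(source: str, arrival_interface: str) -> bool:
    """Accept a multicast packet only if it came in on the interface
    that the unicast table points back toward its source."""
    return unicast_interface(source) == arrival_interface
```

Because the check follows the unicast table, a unicast route change after a link failure 
silently changes which interface passes the check, so multicast traffic cannot be pinned 
to a preferred path. 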

NASA addressed this problem on its NASA Research and Education 
Network (NREN) by moving the responsibility for the multicast network to the same groups 
that were managing the unicast network. Since the hardware usually has a decisive 
influence on the choice of multicast routing protocol, NASA uses PIM in the Cisco-based 
portions of the network, and MOSPF on the Proteon router portion, since they are oriented 
towards MOSPF [Hurwicz, 97]. 

Since distance learning via videoconferencing in the Navy will require 
data to be transmitted worldwide, PIM-SM should be seriously considered as a routing 
protocol in NIPRnet routers used for multicasting. 



2. IP over ATM 

The NIPRnet has a 10-node ATM backbone in the Continental United States that is 
connected via SONET OC-12 (622Mbps) pipes. The ATM switches provide switched 
(SVC) or permanent virtual circuits (PVC), and promise to handle the QoS issues that 
IP multicast traditionally did not address. Therefore, instead of the IP datagrams being 
routed across the long-haul pipes, they will jump to the ATM backbone and exit at the 
NIPRnet router closest to the destination. 

Although it has been proven that ATM has the ability to scale under high traffic 
loads, one major problem with transporting IP over ATM is that the IP datagrams have to be 
mapped to ATM protocols before they go over the ATM backbone, and then converted back. 

Not converting IP datagrams to ATM cells eliminates three potential problems. First, 
IP-to-ATM protocols such as MPOA are complicated, and ATM is still unfamiliar to many 
network managers. Second, standards for the protocols to map IP to ATM are still not 
officially set, although they are close to being finalized. Finally, if the challenge is how to 
push more IP traffic across the data-oriented Internet, one can ignore all of the other things 
ATM is supposed to do (such as voice) and use ATM’s fast hardware for switching IP 
traffic [Dutcher, 97]. Therefore, finding economical ways of trafficking IP datagrams 
across ATM network backbones can be a plus for IP based videoconferencing applications. 

IP over an ATM network combines layer 3 scalability and flexibility with layer 2 
switching and high performance, essentially amounting to VC’s across a TCP/IP network, 
that can stream data at high speeds. Through the development of layer 3 routing in 
switches, two popular methods have emerged, IP switching and Tag switching. 

a. IP Switching 

Developed by Ipsilon Networks, IP switching software adds IP capability to 
ATM switches. The idea is to establish a path across a network. If a network of IP switches 
sets up a “switched” virtual circuit (VC) among themselves across a network, they can 
improve on traditional IP routing. The ATM switch acts as a router for short-duration 
traffic and as an ATM switch for long-duration flows. It is designed to allow network 
administrators to determine how long a flow should be in order to activate switching 
instead of IP routing. 
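
The flow-length threshold can be sketched in a few lines of Python. The class name, the 
packet-count threshold, and the use of a source/destination pair as the flow key are 
assumptions made for illustration; they are not a description of Ipsilon's actual flow 
classifier. 

```python
from collections import defaultdict


class FlowClassifier:
    """Route the first packets of a flow hop by hop; once the flow has
    lasted long enough, hand it to a switched virtual circuit."""

    def __init__(self, threshold_packets: int = 10):
        self.threshold = threshold_packets  # administrator-tunable
        self.counts = defaultdict(int)      # packets seen per flow

    def handle(self, src: str, dst: str) -> str:
        flow = (src, dst)
        self.counts[flow] += 1
        if self.counts[flow] > self.threshold:
            return "switched"
        return "routed"
```

A long-lived videoconferencing stream would cross the threshold almost immediately and 
spend nearly all of its life on the switched path, whereas a short exchange such as a 
name lookup would never leave the routed path. 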

NASA is currently conducting studies on the use of IP switching. Its 
simulation studies have shown eighty-four percent of data packets can be IP routed 
[Breeden, 97]. 



b. Tag Switching 

Tag Switching software is developed by Cisco Systems. Working with ATM 
networks, the software tags, or maps, the current network and stores the data in routers. The 
data packets are tagged and switched as they leave their starting points (in this case 
Bandwidth Management Centers). The tags can use the Last-in-First-Out (LIFO) method at 
the switch based upon its priority designation. The tags allow the network to plot a course 
through the ATM backbone portion. The ATM switches scan the tag and then send the 
packet to the next switch. A tag can be an aggregate of tags, allowing an iterative process 
that increases the scalability of the network. Unlike routers, the switches will need to 
know the complete path to the edge router destination. 

One drawback is that tag switching only works with Cisco equipment. However, 
since the vast majority of routers used in NIPRnet are Cisco routers, there will be no need 
for major hardware procurement to utilize this method. 

c. ATM Considerations 



If either of these two methods is used over NIPRnet’s ATM 
backbone, native routing will essentially be pushed to the periphery of the network, allowing 
IP switching or Tag switching to handle the backbone segment. Each method advertises the 
ability to provide almost the same bandwidth as ATM without having to add an extra layer 
of conversion to already time-critical data. Also, IP over ATM may not only provide 
significant savings in architecture changes, but might also spare customers from 
being forced to implement ATM to the desktop, which would require even more spending. 
One potential problem with these two methods of IP over ATM is that they are still under 
development and have not proven their ability to scale under heavy network loads. 
Furthermore, most videoconferencing applications are already written for IP. 



E. VIDEOCONFERENCING OVER DISN’s SATELLITE SYSTEMS 

A deployed unit’s means of transporting videoconferencing over DISN (i.e. 
NIPRnet) will be military and commercial SATCOM (C-band and Ku-band), Ultra 
High Frequency (UHF) and Super High Frequency (SHF) SATCOM, MILSTAR Extremely 
High Frequency (EHF) Medium Data Rate (MDR), DSCS (military), and/or C band SHF 
terminals (Challenge Athena) into an entry point or DISN gateway. To provide a gateway 
to the terrestrial segments of DISN, this integrated satellite transmission system will be 
further interconnected with the services of the Standardized Tactical Entry Point (STEP). 

1. Space Segment 



The space segment is composed of Ultra High Frequency (UHF) SATCOM; DSCS 
II/III multi-channel SHF (75bps to 1.5Mbps, T-1); MILSTAR Extremely High Frequency 
(EHF) Medium Data Rate (MDR) (4.8Kbps to 1.544Mbps); commercial SATCOM in the 
L, C, and Ku bands (2.4Kbps to 8.448Mbps); and the Global Broadcast System (GBS), 
which is currently being readied. 

Since satellites are inherently a broadcast medium, a typical satellite link, with its 
satellite terminals and military or commercial satellite resources, fits well within the 
basic IP multicast model. 

Deployed units' entry point access is currently supported primarily at Navy 
SATCOM facilities, which serve three of the four NCTAMS. Navy access to 
non-NCTAMS sites requires circuits to be terrestrially back-hauled to the nearest 
NCTAMS site. Navy access procedures to terminal segments are described in Naval 
Telecommunications Publication (NTP)-4, NTP-2, and Communications Information 
Bulletins (CIBs). 



2. Terminal Segment 

Connectivity with shore communities can be leveraged using the Standardized 
Tactical Entry Point (STEP). STEP is a Joint Staff directed upgrade to the DSCS 
portion of the Digital Communications Satellite Subsystem (DCSS) program, which is 
designed to improve and standardize Navy Tactical Satellite Communications (SATCOM). 
Fourteen DSCS sites will eventually be upgraded worldwide to provide access to DISN. 
STEP sites provide both ship-to-shore and ship-to-ship communications consisting of 
operational and administrative traffic. These sites can be either single or dual: a 
single STEP site supports one satellite coverage area, while a dual STEP site supports at 
least two satellite areas. These gateways can allow at-sea units to quickly connect to the 
DISN sustaining base services that they need for videoconferencing data. Under the ITSDN 
Program, NIPRnet routers are installed at the STEP sites, with a 512Kbps transmission 
path provided from the STEP site ITSDN router to the NIPRnet backbone. One drawback 
is that tactical access to ITSDN is provided only on a temporary basis and may require 
CINC approval. The ITSDN IP router address assignments for tactical units are obtained 
and provided by the user. 



3. Network Cache at the Gateways 

Because the ship/shore gateway is a component of the paths of many 
videoconferencing sessions travelling across the NIPRnet, storing sessions on a cache server 
offers a potentially significant savings in bandwidth and end-user latency by allowing 
end-users to retrieve data at the gateway, rather than having to reach back to the original 
source. 

Network caching can be used to deliver video/audio to sea from large disk caches at 
various gateways, while saving needed bandwidth across the NIPRnet’s terrestrial 
backbone network. Therefore, if a student were not able to receive a videoconferencing 
session in real time, he or she might download the stored (recorded) session from a network 
cache server. Since personnel will be enrolled in a variety of courses, it must be assumed 
that not all units will be downloading the same information. Therefore this type of 
flexibility would require a very large disk cache to store information. In order to manage 
the resources, a certain amount of digital storage space would need to be allocated for each 
course on the cache server, and it must also be decided how long to leave a 
videoconferencing session “forward stored” on the server. For example, if a typical video 
and voice data stream, transmitted to the network cache at 300Kbps (near the upper 
transmission end of VIXS), were fifty minutes long, the storage space required for the 
lecture would be approximately 112.5 Mbytes. Table 5-1 shows the estimated storage space. 

300Kbps stream * 1 Byte/8 bits = 37.5KBps 
37.5KBps * 3600 seconds/hour = 135MB/hour 
135MB/hour * (50/60) hours = 112.5MB required per lecture 
Table 5-1: Estimated Digital Storage Requirements 
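
The same arithmetic can be expressed as a short Python sketch so that other stream rates 
and lecture lengths can be tried. The function names are hypothetical, decimal units are 
assumed (1 MB = 1000 KB, 1 GB = 1000 MB), and file-format and file-system overhead is 
ignored. 

```python
def lecture_storage_mb(stream_kbps: float, minutes: float) -> float:
    """Storage in megabytes (1 MB = 1000 KB) for a recorded stream."""
    kilobytes_per_second = stream_kbps / 8
    return kilobytes_per_second * minutes * 60 / 1000


def lectures_per_disk(disk_gb: float, stream_kbps: float, minutes: float) -> int:
    """Whole lectures that fit on a disk of the given size (1 GB = 1000 MB)."""
    return int(disk_gb * 1000 // lecture_storage_mb(stream_kbps, minutes))
```

For a fifty-minute lecture at 300Kbps, lecture_storage_mb(300, 50) returns 112.5, and 
lectures_per_disk(5, 300, 50) returns 44 before any space is set aside for system 
operation. 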

If each course stored one week of lectures on a 5GB disk drive, leaving storage 
space for system operation, over 40 lectures could be stored on just that one drive. With 
digital storage expected to cost about .02 cents per MB by 1998, the cost of storage is 
minimal. 

Network cache systems could be used with the Global Broadcast System (GBS) to 
broadcast videoconferencing data to users. The GBS space segment is a Ka-Band 
communications payload carried aboard U.S. Navy UHF Follow-On (UFO) satellites. By 
providing reliable multicast transport data protocols with GBS, users can download 
videoconferencing sessions from a gateway, and store data locally for future use. User 
requests can be made by a slower back channel. In order to manage bandwidth over the 
space segment, each unit can be given a download-time window, or the network cache can 
be controlled to deliver content to sea only during non-peak hours. 
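
Sizing a download-time window comes down to dividing the session size by the usable link 
rate. The Python sketch below assumes decimal units and ignores protocol overhead and 
retransmissions; the function name is hypothetical. The 512Kbps figure in the usage note 
is borrowed from the STEP-to-NIPRnet path described earlier purely as an example rate and 
is not a GBS data rate. 

```python
def transfer_minutes(size_mb: float, link_kbps: float) -> float:
    """Minutes needed to move size_mb megabytes over a link_kbps link."""
    kilobits = size_mb * 8 * 1000      # MB -> kilobits (decimal units)
    return kilobits / link_kbps / 60   # seconds -> minutes
```

For a 112.5MB session (fifty minutes recorded at 300Kbps) and a 512Kbps path, 
transfer_minutes(112.5, 512) gives a little over 29 minutes, so a one-hour window per 
unit would accommodate roughly two such sessions. 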



3 Survey taken by consulting firm Disk/Trend Inc. of Mountain View, CA. 

F. SHIPBOARD 



1. ADNS 

Because many shipboard networks are not interoperable and require some type of 
gateway to interface with other systems, SPAWARSYSCOM has developed the Automated 
Digital Network System (ADNS) within the Joint Maritime Communications Strategy 
(JMCOMS). ADNS is attempting to convert the Navy stovepipe systems into 
network-compatible systems without incurring the cost of completely redesigning and 
procuring new systems for delivering data to afloat forces [Bergdahl, 96]. 

Currently the bandwidth of ADNS cannot support real-time videoconferencing, but 
as it improves bandwidth capacity, ADNS’s routing and switching system will provide the 
interface to end-user video and voice data across available RF media. The routing and 
switching subsystem should include an IP router and a suite of common multicast routing 
protocols. The routers should also support QoS protocols, such as RSVP. In order to 
prevent multicast packets from wasting unnecessary bandwidth on the shipboard LAN, 
multicast filtering switches might be used. IP multicast-enabled switches automatically set 
up filters so multicast traffic is only directed to participating end-nodes. 
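
The filtering behavior can be pictured with a small Python sketch of a switch that records 
which ports have joined each group and forwards group traffic only to those ports. The 
class and method names are hypothetical, and the sketch says nothing about how any 
particular vendor's switch learns group membership. 

```python
from collections import defaultdict


class MulticastFilteringSwitch:
    """Track which ports have joined each group and forward only to them."""

    def __init__(self):
        self.members = defaultdict(set)  # group address -> set of ports

    def join(self, group: str, port: int) -> None:
        self.members[group].add(port)

    def leave(self, group: str, port: int) -> None:
        self.members[group].discard(port)

    def egress_ports(self, group: str, ingress_port: int) -> set:
        """Ports that should receive a frame addressed to the group."""
        return self.members.get(group, set()) - {ingress_port}
```

A group with no local members yields an empty port set, so a lecture that nobody aboard 
is watching consumes no bandwidth on the shipboard LAN segments. 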



G. CONCLUSION 

As shown in this chapter, the network infrastructure and technology are available to 
deliver IP multicast to sea. If used, delivering videoconferencing over DISN’s IP-based 
networks can alleviate the need for dedicated systems that require people to travel anyway. 
Because the architecture and management systems are already in place, using IP-based 
networks can provide distance learning to a broad audience with minimal spending. 



VI. VIDEOCONFERENCING APPLICATIONS 



A. INTRODUCTION 

This chapter discusses typical videoconferencing software and hardware that can be 
used to deliver distance learning via videoconferencing from a desktop computer over an 
IP-based network. This chapter does not endorse any particular software application(s), but 
merely provides some examples of common tools currently available. This chapter also 
provides the recommended standards for employing desktop videoconferencing. 



B. VIDEOCONFERENCING APPLICATIONS 

Although most of the newer routers and switches are able to support IP 
multicast, on many of them it is not enabled by default. Also, many current software 
applications are unicast and must be modified to interface with the multicasting 
capabilities of TCP/IP stacks, which in turn join and leave multicast groups by using IGMP 
[Hurwicz, 97]. Because companies realize that there is great potential in 
videoconferencing, these issues have not inhibited application developers from eagerly 
creating new products. 4 

Bandwidth and picture quality are still major impediments, but other barriers, such as 
standardization and product and installation costs, continue to decrease. Microsoft has 
embedded its collaboration tool, NetMeeting, in its free Internet Explorer 4.0 browser. 
Netscape 

4 According to Multimedia Research Group, Inc. of Sunnyvale, CA, and Fuji Keizai USA, approximately 
4,000 Web sites offered video clips in 1996. That number tripled to 12,000 in 1997 and is expected to triple 
each year for at least the next three years. 
Communicator 4.0, which is also free, packages an analogous tool, Netscape Conference. 
Microsoft also has released a UNIX version of Internet Explorer 4.0. The MBone also 
uses free desktop videoconferencing applications (vic, vat, sdr, wb) that are proven and 
reliable. Unfortunately, many of the commercial desktop applications, which are PC based, 
are not fully compatible with the MBone tools, which are mostly UNIX based. 

Delivering synchronous/asynchronous video and audio streams to sea not only 
requires a network architecture, but it also requires software tools that are capable of 
providing quality content to the student. Even so, quality content delivery does not replace 
the need for occasional student/instructor collaboration. Today’s desktop videoconferencing 
tools can generally be broken down into two categories. First are standards-based 
collaboration applications, which provide complete information-sharing solutions that span 
the spectrum from one-to-one to fully interactive meetings. Secondly, there are streaming 
applications that broadly distribute one-way, live or stored presentations. Desktop 
collaborating applications enable users to communicate with a small number of others, such 
as for desktop videoconferencing. Streaming applications are much more scalable, making 
it possible to reach a virtually unlimited audience. 

Streaming applications will generally have both client and server software, whereas 
collaboration applications can be client-to-client. To initialize multipoint sessions, 
collaborative application users register their contact information with a location server. 
Four11 and Microsoft’s Internet Location Server (ILS) are two examples. These servers are 
based upon the Lightweight Directory Access Protocol (LDAP). 

Because audio is the most critical and sensitive aspect of videoconferencing, 
applications should provide features that allow audio adjustments to compensate for 
non-guaranteed bandwidth. Applications must support different audio codecs in order to 
allocate certain amounts of the data stream for different bandwidths. Chat room software 
can be used as an option when voice and video are bandwidth constrained. The ability to 
tune audio during transmission, and embedded Forward Error Correction (FEC) or 
redundancy schemes, used in CU-SeeMe and the MBone’s rat tool, can help minimize poor 
audio reception. 
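
The idea behind FEC can be shown with the simplest possible scheme: one parity packet, 
formed by XOR-ing a group of equal-length audio packets, lets the receiver rebuild any 
single packet lost from that group. The Python sketch below is a generic illustration 
with hypothetical function names; it is not claimed to be the scheme that CU-SeeMe or rat 
actually uses. 

```python
def xor_parity(packets: list) -> bytes:
    """Parity packet: byte-wise XOR of equal-length audio packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received: list, parity: bytes) -> bytes:
    """Rebuild the single lost packet from the survivors and the parity."""
    return xor_parity(list(received) + [parity])
```

The cost is one extra packet per group, which is why such schemes are usually tunable: a 
smaller group gives better protection on a lossy link at the price of more bandwidth. 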

Desktop videoconferencing collaboration applications also need a combination of 
document management capabilities, such as file sharing, white board, and snapshot tools, 
which allow users to capture whole windows or parts of windows for cutting and pasting to 
the whiteboard. Standard e-mail applications can be used for administrative purposes, such 
as setting up time for point-to-point conferencing when additional help is required. 

Multicast videoconferencing applications basically use a straightforward 
extension to the BSD 4.3 Berkeley Sockets API, which is supported by operating systems 
such as UNIX, and Windows 95 and NT. As these APIs become cross-platform capable, and 
more readily supported by Winsock 2, they will be ready for widespread use on PCs running 
OSs such as Windows. 
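
The multicast extension amounts to a pair of socket options layered on an ordinary UDP 
socket. The Python sketch below uses the standard socket module, whose IP_ADD_MEMBERSHIP 
and IP_DROP_MEMBERSHIP options correspond to the Berkeley sockets calls; the helper 
function names are hypothetical, and the group address and port a caller passes in are 
whatever the session announcement specifies. 

```python
import socket
import struct


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address, then local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Create a UDP socket and join the group; the stack sends the IGMP report."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


def leave_group(sock: socket.socket, group: str) -> None:
    """Leave the group before closing the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, build_mreq(group))
```

The application never builds IGMP messages itself: joining and leaving are requested 
through the socket options, and the host's TCP/IP stack handles the group membership 
signaling. 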



C. RECOMMENDED STANDARDS 

H.323, H.324, and T.120, along with multicast protocols such as IGMP, RTP, RTCP, 
and RSVP, make up the primary standards for desktop videoconferencing systems. As an 
extension of H.320, the standard for ISDN, H.323 addresses multipoint videoconferencing 
over packet networks such as LANs and the Internet. H.324 is the standard for real-time 
multimedia over POTS. When using applications from different vendors, ensure that each 
completely implements the standards it claims. For example, Microsoft NetMeeting and 
Netscape Conference are both "H.323 compliant," but they do not have any common audio 
codecs, rendering them unable to talk with each other. Even with these inconsistencies, 
standards support and the deployment of an application base are no longer inhibitors for 
most desktop videoconferencing. As in the past, network bandwidth and 
interoperability across different platforms are still the major problems. Dial-up with 
modems over POTS continues to be a choke point for delivering and receiving 
videoconferencing. As H.324 matures, manufacturers will begin to build more H.324 
compliant chip sets into hardware. As of now, H.324 is acceptable for point-to-point 
collaboration, but not for supporting IP multicast. 
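
The interoperability problem reduces to a set intersection: two endpoints can exchange 
audio only if their advertised codec lists overlap. The Python sketch below is a 
simplified illustration with hypothetical function names; the codec lists in the usage 
note are invented examples and are not the actual capability sets of any product. 

```python
def common_codecs(local: list, remote: list) -> list:
    """Codecs both endpoints support, in the local order of preference."""
    remote_set = set(remote)
    return [codec for codec in local if codec in remote_set]


def can_exchange_audio(local: list, remote: list) -> bool:
    """True when at least one audio codec is shared."""
    return bool(common_codecs(local, remote))
```

For example, an endpoint offering ["G.723", "G.711"] and another offering ["G.728"] share 
nothing, so can_exchange_audio returns False even though both lists contain only 
standard codecs. This is why checking the codec lists, and not just the compliance label, 
matters when mixing vendors. 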

Although the ITU-T has provided the baseline codec standards for 
videoconferencing, several de facto standards have emerged. Microsoft Video 
for Windows and Apple QuickTime are common video formats. QuickTime is compatible 
with both Windows and Macintosh environments, and its file format has been accepted by 
ISO as the basis for the MPEG-4 file format. The use of hardware codecs can alleviate some 
of the CPU usage, but today’s multimedia capable processors are more than capable. 

One of the first companies to market a product fully based upon IETF standards that 
relate to real-time video and audio streams, and ITU-T standards for data compression and 
decompression, was Precept Software. Its Flashware Server software and IP/TV viewer 
client were initially available for Wintel based systems. Because it implements 
nonproprietary standards, this product can receive MBone group sessions, giving it the 
capability to interoperate with UNIX platforms. Until more companies adopt universal 
standards, this is one of the few options for cross-platform capability between UNIX and PC 
users. 

Table 6-1 describes the minimum standards needed for videoconferencing systems. 

                        LAN/WAN, Internet             POTS 
                        (H.323)                       (H.324) 

Video                   H.261, H.263                  H.261, H.263 

Audio                   G.711, G.722, G.728,          G.723, 
                        Full-Duplex                   Full-Duplex 

Whiteboard              T.120, JPEG, GIF, TIFF,       T.120, JPEG, GIF, TIFF, 
                        Postscript, Still Frame       Postscript, Still Frame 
                        Capture, File Transfer        Capture, File Transfer 

Additional Features     Chat Functions,               Chat Functions, 
                        Application Sharing           Application Sharing 

Multicasting            RTP, RTCP, (RSVP, RTSP        - 
                        when adopted), Multiple 
                        Simultaneous Sessions 

Controls                BW Controls (Frame-Rate,      BW Controls 
                        Image Size) 

Asynchronous Support    Yes                           Yes 

Additional Support      Firewall Configuration,       Trial Copies for testing 
                        Trial Copies for testing 

Router Support          MOSPF, PIM                    - 

Table 6-1: Videoconferencing Standards over IP Networks 

D. HARDWARE 



Today’s desktop computers provide most of the hardware components needed for 
videoconferencing. A good camera and video capture card, which together can cost as 
little as $200, are normally all of the upgrading that is required. This is a markedly low 
price in comparison to roll-about and room-based systems. The release of newer, faster 
multimedia based processors is sealing the fate of expensive hardware codecs. The 
following hardware is recommended for a desktop system that supports videoconferencing: 

• Desktop w/ processor that supports multimedia 

• Digital camera for face view 5 

• Microphone 

• Speakers and/or headphones 

• 16 bit sound card, (full-duplex) 

• Video Card 

• Video capture card 6 

• Web Server 7 

• Minimum 28.8Kbps Modem 



5 Cameras that connect to video capture boards are recommended. Parallel port cameras require 
excessive CPU cycle time (for less powerful CPUs, slower than a Pentium 133, use an analog camera). 

6 Video capture cards may include onboard codecs, but as processor power has increased, these more 
expensive boards are unnecessary. This recommendation is based upon a face-to-face conference. If a server is 
the capturing device it will use a video capture board. 

7 For Streaming Video Applications over Internet/Intranets 

Cameras that connect to video capture boards are recommended. Parallel port cameras 
are unacceptable because of inadequate data throughput, and because they require excessive 
CPU cycle time. 



E. SUMMARY 

The ITU-T and IETF standards will likely gain broad acceptance since they are 
based upon videoconferencing over the commonly existing network architectures. In order 
for videoconferencing to gain full acceptance, H.320, H.323 and H.324 must work together 
in integrated applications. 

Although desktop videoconferencing is becoming more capable, the frame rates and 
picture sizes of streaming videoconferencing applications are still lacking. If these 
applications are used in conjunction with collaborative software such as whiteboards, shared 
applications and shared control, there is adequate functionality to conduct meaningful 
learning. 
VII. VIDEOCONFERENCING DEMONSTRATION



A. INTRODUCTION 

This chapter provides a proof of concept that demonstrates how current
videoconferencing software can be used to deliver synchronous or asynchronous material
for distance learning over an IP-based network via multicast. The demonstration follows on
work accomplished in Internetworking: Economical Storage and Retrieval of Digital
Audio and Video for Distance Learning [Tiddy, 95] and Internetworking: Worldwide
Multicasting of the Hamming Lectures for Distance Learning [Emswiler, 95].



B. OVERVIEW 

Several free software tools were considered, and the one selected was the MBone 
VCR on Demand (MVoD), developed by Wieland Holfelder at the University of Mannheim, 
Germany. The MVoD is a free, experimental software solution for the interactive remote 
recording and playback of multicast videoconferences. The MVoD Service offers a 
graphical user interface (GUI) environment where the user can interactively record 
audio/video conferences on a remote server, controlling the recording session with a local 
client application. Later, that same user or other users can play the session back on demand, 
via multicast or unicast. 

Through the use of this tool, the goals of this experiment were to demonstrate:

• A successful download and installation of the MVoD Service software.

• Multicasting a prerecorded taped lecture over the MBone via an SGI workstation
while recording the multicast lecture using the MVoD Service from a second
workstation that has the MVoD Service software installed.

• Using the MVoD Service to play back and multicast a satisfactorily replicated
session over the MBone, which can be received by multiple users.



C. DEMONSTRATION 

To begin the testing, the MVoD Service software was downloaded from the site
http://www.informatik.uni-mannheim.de/informatik/pi4/projects/MVoD. Version 0.9a7 of
the software was installed on a Silicon Graphics Indy running the IRIX 6.2 OS, with 128 MB of
RAM and a MIPS R1000 processor. The MBone tools sdr, vic and vat were already
installed. The MVoD architecture consists of three components:

• The MVoD Server: handles the user and session management 

• The MVoD Client: offers the users a GUI to access the MVoD Service 

• The RTP DataPump: is responsible for the recording and playback, the 
synchronization and the administration of the RTP data streams. 

A number of internal protocols have been developed to provide communication
between the various MVoD software components. They include the:

• VCR Announcement Protocol (VCRAP): used by the server to announce its services to all
clients.

• VCR Service Access Protocol (VCRSAP): used by the clients to access the
server.

• VCR Stream Control Protocol (VCRSCP): used by the client to access and control a
session on the server.

• RTP DataPump Control Protocol (RDCP): used by the server to control the RTP
DataPumps (one per session).

An interface has also been implemented with the Session Announcement Protocol 
[Perkins, 97], which is used by the MBone tool sdr, in order for the MVoD server to learn 
about ongoing MBone sessions. Figure 7-1 is the MVoD architecture with its various 
protocols, which are used in conjunction with MBone tools. Detailed explanations of the 
various protocols can be found at the web site. 
Figure 7-1 MVoD Architecture [Holfelder, 97]



The testing was accomplished using two SGI workstations (Indy and Octane models)
on the NPS LAN. The test lectures for the multicast transmission, which had been
developed for the thesis "Internetworking: Worldwide Multicasting of the Hamming Lectures
for Distance Learning" [Emswiler, 95], were input to an SGI Indy workstation (blacknoise)
from the line output of a VCR. The MVoD Service was running on an adjacent
workstation (electric). The MVoD Service and the MBone tools sdr, vat and vic were the
software used for the experiment. The MBone tools are also free and can be downloaded
from many ftp sites that provide MBone tools. The MBone tools used have already been
proven effective; therefore, the focus of this chapter is on the effectiveness of the
MVoD recording and playback processes.



D. RECORDING A BROADCAST 

The first step was to set up the workstation expected to multicast the lecture over the
MBone by providing video and audio line connections from the VCR. The sdr tool on
blacknoise was used to create a new MBone session. The video and audio source of the
MBone transmission was provided by a VCR that played back the Hamming Lecture Series.
Once the VCR was connected and the session created, the lecture was multicast over the
MBone using vic and vat (RTPv2). For the multicast, the default bandwidth settings for vic
H.261 video (128 Kbps) and vat PCM audio (64 Kbps) were used. The time-to-live (TTL) was set to
15 in order to keep the transmission restricted to the campus LAN.
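
The TTL scoping used in this step can be illustrated with a short Python sketch. It does not reproduce how vic or vat set the TTL internally; it only shows, as an assumption about a generic multicast sender, how an application can limit the scope of its packets with the standard IP_MULTICAST_TTL socket option. The function name and constant are illustrative and do not come from the MBone tools.

```python
import socket

CAMPUS_TTL = 15  # value used in the demonstration to stay on the campus LAN


def make_multicast_sender(ttl: int = CAMPUS_TTL) -> socket.socket:
    """Return a UDP socket whose outgoing multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Routers decrement the TTL of each forwarded packet, so a small
    # initial value limits how far the multicast stream can travel.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

A sender created this way would transmit with a TTL of 15 unless another value is passed in.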

On the workstation “electric,” the MVoD Service was running. The MVoD client
GUI was used to control the MVoD server and RTP DataPump. The May 26, 1995 lecture,
which lasted for 37 minutes, was recorded by MVoD using a 128 Kbps (maximum) vic video
stream. After the session was recorded, the file size of the recording was noted. Based
upon the five files that the MVoD server creates for each recorded session, the total file size
for the transmission was approximately 46 MB. The recording therefore averaged 1.24 MB
per minute of data stored. A standard 50-minute lecture would require
approximately 62 MB of storage, and the expected file size is thus approximately
75 MB per hour. (This size fits conveniently on a 100 MB Zip disk.)
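
The storage arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function names are illustrative, megabytes are assumed to be decimal (1 MB = 1,000,000 bytes), and the upper bound ignores packet header and index file overhead.

```python
def storage_estimate(total_mb: float, minutes: float) -> dict:
    """Scale a measured recording size to other lecture lengths."""
    rate = total_mb / minutes  # MB stored per minute of lecture
    return {
        "mb_per_minute": rate,
        "mb_per_50_min_lecture": rate * 50,
        "mb_per_hour": rate * 60,
    }


def stream_upper_bound_mb(video_kbps: float, audio_kbps: float, minutes: float) -> float:
    """Size of the session if both streams ran at their full configured rate."""
    bits = (video_kbps + audio_kbps) * 1000 * 60 * minutes
    return bits / 8 / 1_000_000


measured = storage_estimate(total_mb=46, minutes=37)
upper_bound = stream_upper_bound_mb(video_kbps=128, audio_kbps=64, minutes=37)
```

With the measured values (46 MB over 37 minutes), this yields about 1.24 MB per minute, 62 MB for a 50-minute lecture, and 75 MB per hour. The upper bound for the 37-minute session at 128 Kbps video plus 64 Kbps audio is about 53 MB, which is consistent with the 46 MB observed, since the video rate is a maximum rather than a constant.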
E. PLAYING BACK AN MBone RECORDED SESSION



The next step in evaluating the MVoD Service was to play back the
recorded session while simultaneously multicasting it over the MBone. Using the MVoD
client GUI on electric, a list of sessions previously recorded by the server (only one, in our
case) was displayed. Once the session was selected, the GUI also provided
the option of playing back audio, video, or both media. Both audio and video were
selected. When the play button was clicked, the session was multicast over the MBone and
vic and vat were automatically launched locally so that the person playing back the
session could observe it. The transmission used the same bandwidth settings that were used
during the original session; these cannot be changed.

The rebroadcast (playback) of the session was observed using the vic and vat tools on
“electric” and “blacknoise.” From the observation, there was no discernible difference
between the recorded session and the original. There was no packet loss because
there was no congestion on the LAN containing the multicasting and receiving workstations.



F. EVALUATION OF RESULTS 

Currently MVoD only runs on UNIX systems. For a user having little experience
with UNIX command lines and environment variables, the MVoD tool is not easy to install.
Therefore it is recommended that only system administrators or experienced UNIX users
install the software. During the initial installation, there were problems with killing
processes. For instance, some processes could not access sockets even after the prior process
using the socket had been killed. With more proficiency in using the tool and
shutting it down properly, this was no longer an issue. Further development of MVoD may make
installation simpler, and will likely provide a Windows version as well.

One important result to note is that having too many applications running on the
workstation consumed CPU cycles, effectively slowing down the compression
rate of the transmission. In all of the playbacks (multicasts), the default transmission rates
for the audio and video provided a clear reproduction of the original audio/video session.
No experiments were conducted using the MBone wb tool. Whiteboard recording
is not likely to be available soon because of the distributed, asynchronous nature of
whiteboard events.

The results of the audio and video testing are satisfactory and demonstrate the
successful recording and playback (multicast) of a distance learning lecture using the MVoD
Service.

G. SUMMARY

The results of this experiment showed that software tools are
available to receive, archive, and retransmit distance learning lectures. Once set up
properly, the software provides a simple GUI that is easy to use and provides not only
playback on demand but also recording on demand. Being able to record content for future
use enables users to build a local library of distance learning content.

The MVoD tool, or a similar tool, can be used to remotely record an instructor’s
lecture. MVoD could be set up to perform as described in Chapter V. A student can use the
MBone tools to connect to the session during the live broadcast, or use the MVoD client
GUI to receive a prerecorded session at a more convenient time. If bandwidth over the
network segment is restricted, which may often be the case, users can ftp the session from
the cache server for local playback.






VIII. CONCLUSIONS AND RECOMMENDATIONS



A. SUMMARY OF FINDINGS 

The underlying premise of this thesis is that desktop videoconferencing can be
implemented over the currently available DISN IP-based networks instead of dedicated,
expensive, point-to-point, room-based systems that cannot provide the scalability necessary
to deliver distance learning to a broad, globally dispersed audience. IP multicast is
designed to scale well as the number of participants and collaborations expands, so that
adding one more user does not add a corresponding amount of bandwidth at the source. The
sender transmits a single stream, so it costs no more and requires no more sender bandwidth
for 100,000 viewers than it does for
one. This fits well with the desire to deliver distance learning to numerous participants.
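
The scaling argument can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a 192 Kbps stream (the 128 Kbps video plus 64 Kbps audio used in Chapter VII) and compares the bandwidth a sender must supply when it unicasts a separate copy to every receiver with the single copy it sends under IP multicast. The function name is hypothetical.

```python
def sender_bandwidth_kbps(stream_kbps: float, receivers: int, multicast: bool) -> float:
    """Bandwidth the sender must supply to reach the given number of receivers."""
    if receivers < 1:
        raise ValueError("at least one receiver is required")
    # Unicast sends one copy per receiver; multicast sends one copy in
    # total and relies on routers to replicate it where paths diverge.
    return stream_kbps if multicast else stream_kbps * receivers


STREAM_KBPS = 128 + 64  # video plus audio rates from the demonstration

unicast_load = sender_bandwidth_kbps(STREAM_KBPS, 100_000, multicast=False)
multicast_load = sender_bandwidth_kbps(STREAM_KBPS, 100_000, multicast=True)
```

Under these assumptions the unicast sender would need 19,200,000 Kbps (about 19.2 Gbps) for 100,000 viewers, while the multicast sender still needs only 192 Kbps.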

Just within the past two years videoconferencing technology has made enormous
strides, and the current capability to implement real-time, off-the-shelf or free standards-based
products has advanced greatly beyond what was available in the past. There are
sufficient, well-tested standards that can be used in IP-based videoconferencing. Desktop
videoconferencing via IP-based networks in the DII is a viable tool that can add numerous
economic benefits, such as decreased spending for travel and eliminating the need to rely
on large, room-based videoconferencing systems.
B. RECOMMENDATION FOR FUTURE RESEARCH



This thesis provides a preliminary study on the technological and economic benefits 
of implementing IP multicast videoconferencing technology from desktops to remote 
locations. As part of the strategic planning process, additional research is needed to
determine the effect of network parameters, such as bandwidth and latency, on videoconferencing
technology within the DISN. Additional research is also required in the areas of:

• comparing ATM multicast to IP switching and its viability in wide-scale
videoconferencing.

• comparing current desktop videoconferencing software in its
implementation in distance learning.

• determining the feasibility of tunneling over NIPRnet.

• setting up a course and delivering its contents using the MVoD Service, which
can provide an actual demonstration of distance
learning from the desktop.

• determining how network caching and web hosting can be used in videoconferencing.

• implementing RSVP and RTSP over the NIPRnet.
APPENDIX A. GLOSSARY OF TERMS



API: Application Programming Interface; the generalized term for a defined software 
interface for software applications. 

Asynchronous Transfer Mode (ATM): A connection-oriented technology defined by 
the ITU and the ATM Forum. At the lowest level, ATM sends all data in fixed cells with 
48 octets of data plus five octets of header information, per cell. 

Autonomous System: A network controlled by a single administrative authority; a 
routing domain. 

Broadcast: The sending of information from one host to all hosts on a LAN.

Class A: A type of unicast IP address that segments the address space into few network
addresses and many host addresses.

Class B: A type of unicast IP address that segments the address space into a medium
number of network and host addresses.

Class C: A type of unicast IP address that segments the address space into many network
addresses and few host addresses.

Class D: Multicast IP group addresses. 

Connectionless: Term used to describe data transfer without the existence of a virtual
circuit. UDP is connectionless and provides best-effort, unreliable delivery.

CRC: Cyclic Redundancy Check; a mechanism to detect errors in frames. 

Ethernet: An industry LAN standard sponsored by DEC, Xerox, and Intel in the early 
80s. Became the basis for the official IEEE 802.3 LAN standard. 

Frame: The link-layer data entity; data is packaged in frames, for the purpose of 
transmission over a network. Frames are bounded by flag characters or some other 
delimiter. 

H.320: An ITU-T umbrella of standards for videoconferencing over narrow-band circuit- 
switched WAN services such as ISDN. 

H.323: An extension of H.320, it covers videoconferencing not only over narrow-band 
WAN services, but also on packet-switched networks, such as LANs and the Internet. 

H.324: The ITU-T’s standard for real-time multimedia over standard POTS lines using 
28.8Kbps V.34 modems or better. 
Host: The generalized term for any device that can be a source or sink of information on
a network. Generally, a host is a single-networked computer. 

IETF: Internet Engineering Task Force; the body associated with the Internet that 
recommends and approves "standards" for use on the Internet. 

IGMP: Internet Group Management Protocol, the protocol with which hosts
communicate with the nearest router supporting multicast to notify it about
membership in a multicast group.

IP: Internet Protocol; the network layer (layer 3) of TCP/IP. Network layer addresses are 
used by routers for routing purposes. 

ITU-T: The Telecommunication Standardization Sector of the International
Telecommunication Union, a body of the United Nations which controls the standards
for telephone systems.

MAC: Media Access Control; the protocol used in a LAN or other shared transmission 
media for gaining access to the media. 

MBone: Multicast Backbone; a virtual, experimental network that runs on top of the
Internet to provide multicasting of live video and audio around the world.

Multicast: The sending of information from one to many, but not all members of a 
network. See RFC 1112. 

Multicast Group: A group set up to send and receive messages from multiple sources 
and receivers. These groups can be set up based on frame relay or IP in the TCP/IP 
protocol suite, as well as in other networks. 

OSI Model: A seven-layer model of data communications protocols standardized by the
International Organization for Standardization (ISO).

PVC: Permanent Virtual Circuit; a permanent logical connection set up with packet data 
networks such as frame relay or ATM. 

RFC: Request for Comment; the document that the IETF uses to define standards for use 
and recommend practices in the Internet. 

RTP v2: Real-Time Transport Protocol Version 2 is a real-time transport protocol that 
provides end-to-end delivery of services to support applications transmitting real-time 
data, for example, interactive video and audio, over unicast and multicast network 
services. See RFCs 1889 and 1890. 
RTCP: Real-Time Control Protocol is a control protocol used in conjunction with RTP.
RTCP provides information to applications, identifies RTP sources, controls RTCP
transmission intervals, and conveys minimal session control information. See RFCs 1889
and 1890.

RSVP: Resource Reservation Protocol is an experimental resource reservation set up 
protocol designed for an integrated services network, that is currently under development. 
An application might invoke RSVP to request specific end-to-end QoS for a data stream. 

SVC: Switched Virtual Circuit; a switched logical connection set up on a temporary basis 
with packet data networks such as frame relay or ATM. 

TCP/IP: The protocol suite used in the Internet. The most important protocol suite used 
in networking. 

TTL (time to live): A counter that is decremented each time a packet passes through a
router. The packet is discarded when the counter reaches zero.

Unicast: The sending of information from one to one in a network; point-to-point data 
packet communication. 
APPENDIX B. INSTRUCTIONS FOR THE BASIC OPERATION OF THE
MBone VCR on DEMAND SERVICE (MVoD)



This user’s guide has been developed from experience and the information
provided by the HTML files that accompany the MBone VCR Service
program. It is also a follow-on to the MVoD instruction manual from
Internetworking: Economical Storage and Retrieval of Digital Audio and Video for
Distance Learning (Tiddy, 95). It is designed to provide basic assistance to anyone who
desires to use the MBone VCR Service to record or play back a multicast session, and is
in no way all-encompassing. There are no instructions in this appendix for operating the
MBone tools. Information about the MBone can be found at the MBone Information
Web, available at http://www.MBone.com/.



A. OVERVIEW OF THE MVoD SERVICE 

During the recording, the MVoD Service will synchronize the data streams based
upon the information provided by the RTPv2 protocol. As with any multicast-capable
application, the MVoD Service does not need to know the source address of a data stream
or the exact content of the data stream, as long as the data stream conforms to the
protocols supported by the MVoD Service.
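
To illustrate the kind of information RTPv2 makes available for synchronization, the following Python sketch parses the 12-byte fixed RTP header defined in RFC 1889. It is not taken from the MVoD source code; the class and function names are illustrative. The timestamp and sequence number fields are what a recorder could use to restore timing, and the SSRC field distinguishes sources without reference to their network addresses.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header found at the start of every RTP packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet carries at least a 12-byte header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,           # top two bits; 2 for RTPv2
        marker=bool(b1 >> 7),      # top bit of the second byte
        payload_type=b1 & 0x7F,    # e.g. 31 for H.261 video (RFC 1890)
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```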

A session recorded by the MVoD Service can be one of as many as 100 multicast 
sessions that a user desires to record. As many as 20 clients can access the server 
simultaneously. 
To play back a recorded session, the MVoD Service RTP DataPump sends the data
out to the network, recovering the original timing and synchronization of all the media
streams included in this session and using the same network protocols used by the
applications from which the data was recorded. The MVoD interface is shown in Figure
B-1.




Figure B-1 MBone VCR on Demand Interface [Holfelder, 98]
B. DOWNLOADING THE TOOLS



The MVoD Service software can be downloaded from
http://www.informatik.uni-mannheim.de/informatik/pi4/projects/MVoD. The site contains a
description of the service as well as a source for the various versions of the tools (based on
the UNIX workstation). The version described in this manual is 0.9a7, downloaded to a Silicon
Graphics Indy running the IRIX 6.2 OS, with 128 MB of RAM and a MIPS R1000
processor. It also ran on a more powerful SGI Octane workstation. The workstation that
the MVoD Service is downloaded to must have JDK 1.1.4 or higher in order to run the
client and server components. This resource can be found at
http://www.javasoft.com/nav/download.

Once the tool has been downloaded, it must be unzipped, using gunzip, and then 
un-tarred using the tar -xvf command line to install it on the local workstation. The 
readme file will be included. It will provide detailed instructions for installing and 
running the MVoD service. 
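
For readers who prefer to script the unpacking step, the gunzip and tar -xvf sequence can be approximated with Python's standard tarfile module. This is a sketch under assumptions: the function name is illustrative, and the archive name in the usage note is hypothetical, since the actual file name depends on the version downloaded.

```python
import tarfile
from pathlib import Path


def unpack_archive(archive: str, dest: str = ".") -> list[str]:
    """Decompress and extract a gzip-compressed tar archive into dest."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    # Mode "r:gz" combines the gunzip and tar -xvf steps.
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names
```

For example, unpack_archive("mvod.tar.gz", "mvod") would extract a hypothetical archive named mvod.tar.gz into a directory named mvod and return the list of extracted member names, among which the readme file should appear.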



C. USING THE MVoD SERVICE 

The following sections describe the basic functions available to the users of the 
MVoD Service client, and assume that the system administrator has already properly 
installed the MVoD Server. Additional information can be found in the readme files. 
1. Connect to a Server



The first thing that a user needs to do is connect to the desired MVoD server. The 
GUI will list the servers that the clients will be able to access. The servers announce 
themselves via the VCR Announcement Protocol (VCRAP). From the list of servers in 
the left window, highlight the desired server. On the toolbar, select the computer icon, 
and that will connect the client to the server. Then, the user will need to log on to the 
MVoD server. 



2. Select an MBone session 

Below the left window, select from the drop down menu, “SAP announcements.” 
This will show the user a list of the current MBone sessions that the sdr is advertising to 
the MVoD Server. Highlight the desired session. Then go to the Session drop down 
menu and select "Connect to session," or go up to the toolbar and click the tape icon. 
This step will connect the user to the desired MBone session. At this point the RTP
DataPump will create five files related to that particular session in a directory called data
(the location of directories is explained in the readme file). In the data directory, you
will find one session description file (*.rdcp) for every session and two files (*.rec and
*.idx) per media in a session. An index file ends in .idx and a data file
ends in .rec. The filenames for these files are automatically generated from the session
filename and the corresponding rdcp-id. For example, suppose that a session stored in the
session file whd-007.rdcp consists of one media with rdcp-id 0 and one media with
rdcp-id 1. Then the automatically generated media files would be: whd-007-0.idx,
whd-007-0.rec, whd-007-1.idx and whd-007-1.rec. The content of the .rec file is more or less
the raw rtp-data dumped into the file as it was received from the network. The .idx file
contains a fixed-length header per data packet that holds a mapped timestamp generated
from information in the rtp-timestamps, an offset to the corresponding rtp-data packet in
the .rec file and a few other details.
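
The naming rule for the media files can be expressed as a short Python function. This is an illustrative sketch of the convention described above, not code from the MVoD distribution; the function name is hypothetical.

```python
from pathlib import PurePosixPath


def media_files(session_file: str, rdcp_ids: list[int]) -> list[str]:
    """List the index and data file names generated for each media in a session."""
    stem = PurePosixPath(session_file).stem  # "whd-007.rdcp" -> "whd-007"
    files = []
    for rdcp_id in rdcp_ids:
        files.append(f"{stem}-{rdcp_id}.idx")  # index file for this media
        files.append(f"{stem}-{rdcp_id}.rec")  # raw RTP data for this media
    return files


example = media_files("whd-007.rdcp", [0, 1])
```

For the session in the example, the function returns the four media file names listed in the text. Together with the session description file, this accounts for the five files created for a session with one audio and one video stream.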

The user will also notice that, once the connection is made, the MBone VCR 
record function button, located on the lower right of the display will become enabled, and 
the left window will display the media (video and/or audio) associated with that session. 
At this point the user is ready to record the session that he or she is connected to. 



3. Recording a Session 

Once connected to the session, and having verified that data is being
transmitted over the MBone, select the red record button. 8 The MVoD data files for the
session are now being recorded and stored in the data directory. To stop the recording,
use the left mouse button to click the square, black stop function button.

With most of the MBone tools (e.g., vat and vic), you cannot record data that is sent from
the same host where the RTP DataPump daemon is running because these tools
do not perform so-called local loopback. However, for playback you can run the RTP
DataPump daemon and the MBone tools on the same host since the RTP DataPump does
not turn off local loopback.



8 By default MVoD does not start to record if it does not receive a data signal from any of the media in the 
session. To start recording when no data is present, select the “Recording without signal” button from the 
Options drop down menu. Once the button is selected, the digital timer on the right of the display will 
activate. 
In other words, you cannot run the source multicast transmission and the recording
client on the same machine, but no single-machine restrictions exist during playback.
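
The loopback behavior can be illustrated with the standard IP_MULTICAST_LOOP socket option. The sketch below rests on an assumption: that the "local loopback" mentioned above corresponds to this option, which controls whether multicast packets sent by a host are also delivered to receivers on that same host. It is not taken from the source code of vat, vic, or the RTP DataPump, and the function name is illustrative.

```python
import socket


def make_sender(loopback: bool) -> socket.socket:
    """Return a UDP multicast sender with local loopback switched on or off."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # With loopback off, a recorder on the same host never sees the packets
    # this socket sends; with loopback on, local receivers get a copy.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1 if loopback else 0)
    return sock
```

Under this assumption, a transmitting tool behaves like make_sender(False), which is why recording must take place on a second machine, while the RTP DataPump behaves like make_sender(True) during playback.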



4. Editing a Session or Media 

In order to edit a session that has already been loaded and created, the user must 
display the available sessions. Click on the drop down menu on the lower left of the 
display and select “Recorded Sessions”. The left window will display the sessions that 
have been stored (recorded). Select the session that is desired. Connect to the session by 
clicking the tape icon. Once connected to the session, the media types recorded from the 
MBone session will be displayed in the left window. 

a. Mute a Media 

A single click with the left mouse button on the media list will select a
media so it can be muted/unmuted with the “mute/unmute” icon or the “mute/unmute”
selection under the Media dropdown list. If the media is muted, angle brackets < >
surround it.



5. Play a Session 

To play a session back, simply click the “play” button. In order to listen to and/or
watch the data, the MBone tools vic and vat need to be launched. They can be launched by
selecting the “Tools” dropdown list and then “start MBone tools”. To stop the
session, click the stop function button.
6. Fast Forward and Rewind



To fast forward (ff) or rewind (rew) a session, click on the “ff” button or the 
“rew” button. 



7. Random Access with the Session Slider 

The slider on the lower part of the display enables random access within the
session. Clicking with the middle button somewhere in the slider will forward or rewind
the session to this point. Clicking the left mouse button on the slider to the left of the
marker will rewind the session about one minute. Clicking the left mouse button to
the right of the marker will forward the session about one minute. At the lower right
corner of the display, the total length of the session is displayed.

8. Loop Mode 

If the “Loop Mode” entry in the “Options” drop down list is selected during 
playback, the session will start all over from the beginning when it reaches the end. This 
feature allows continuous transmissions. 
9. Quick-Keys



The following quick-keys are available for the MVoD Client. They will probably 
continue to change as the product matures. 



Key            Meaning

q              quit
backspace      go back one level
p              play
shift-p        play at
s              stop
shift-s        stop at
r              record
shift-r        record at
e              edit session
i              info about session
t              start tools
shift-t        start all tools automatically
l              loop-mode on
shift-l        loop-mode off
right          forward one second
shift-right    forward 10 seconds
ctrl-right     forward 1 minute
left           back one second
shift-left     back 10 seconds
ctrl-left      back 1 minute
up             goto end
down           goto start



D. KNOWN BUGS and SHORTFALLS 



This demonstration used Version 0.9a7, downloaded to a Silicon Graphics Indy
running the IRIX 6.2 OS, with 128 MB of RAM and a MIPS R1000 processor. This version
of the MVoD service is more user friendly, because it uses the GUI to alleviate
many of the manual inputs required in the (Tiddy, 96) instruction guide. This section
delineates the known bugs and some of the shortcomings of the MVoD Service.

• The MVoD versions for the SUN workstation could not be untarred. A 
checksum error was displayed. 

• Whiteboard (wb) is not yet supported. 



E. SUMMARY

Once installed, the MVoD tool is easy to operate. The GUI is user friendly, and
provides context help in the status line, depending on the state of the client and the area
under the mouse pointer. Although not all-encompassing, this instruction manual can aid a new
user with simple operation of the MVoD client.


